Abstract

In this chapter a general mathematical framework for probabilistic 
theories of operationally understood circuits is laid out. Circuits are com- 
prised of operations and wires. An operation is one use of an apparatus 
and a wire is a diagrammatic device for showing how apertures on the 
apparatuses are placed next to each other. Mathematical objects are de- 
fined in terms of the circuit understood graphically. In particular, we do 
not think of the circuit as sitting in a background time. Circuits can be 
foliated by hypersurfaces comprised of sets of wires. Systems are defined 
to be associated with wires. A closable set of operations is defined to be 
one for which the probability associated with any circuit built from this 
set is independent both of choices on other circuits and of extra circuitry 
that may be added to outputs from this circuit. States can be associated 
with circuit fragments corresponding to preparations. These states evolve 
on passing through circuit fragments corresponding to transformations. 
The composition of transformations is treated. A number of theorems are 
proven including one which rules out quaternionic quantum theory. The 
case of locally tomographic theories (where local measurements on a
system's components suffice to determine the global state) is considered. For
such theories the probability can be calculated for a circuit from matrices
pertaining to the operations that comprise that circuit. Classical probability
theory and quantum theory are exhibited as examples in this framework. 



1 Introduction 

Prior to Einstein's 1905 paper [1] laying the foundations of special relativity it
was known that Maxwell's equations are invariant under the Lorentz transfor- 
mations. Mathematically the Lorentz transformations are rather complicated 
and it must have been unclear why nature would choose these transformations 
over the rather more natural looking Galilean transformations. Further, there
was an understanding of the physical reasons for Galilean transformations in
terms of boosts and the additive nature of velocities. We find ourselves in a
similar situation today with respect to quantum theory. Regarded as a proba- 
bilistic theory, it is much more complicated from a mathematical point of view 
than the rather natural equations of classical probability theory. And further, 
we can motivate classical probability by ordinary reasoning by imagining that 
the probabilities pertain to some underlying mutually exclusive set of possibil- 
ities. The situation with respect to the Lorentz transformations was resolved 
by Einstein when he showed that they follow from two very reasonable postu- 
lates: that the laws of physics are the same in all reference frames and that 
the speed of light in vacuum is independent of the motion of the source. Once 
we see Einstein's reconstruction of the Lorentz transformations we have a sense 
that we understand why, at a fundamental level, nature prefers these over the 
mathematically simpler Galilean transformations. We need something similar 
for quantum theory [2]. 

The subject of reconstructing quantum theory has seen something of a revival 
in the last decade [2]. Generally, to reconstruct quantum theory we write down a
set of basic axioms or postulates which are supposed to be well motivated. They
should not appear unduly mathematical. Then we apply these in the context of
some framework for physical theories and show that we obtain quantum theory. 
This framework itself should be well motivated and may even follow from one 
or more of the given postulates. 

The purpose of this chapter is to set up one such framework. This will 
be a framework for general probabilistic operational theories. There is a large 
literature on this (see Section [2]). To construct the mathematics of such a 
framework we must first specify what we mean by our operational structure. 
Only then can we add probability. The mathematics associated with the part 
of this where we add probability has become fairly sophisticated. However, 
a fairly simple minded point of view is usually taken with respect to setting 
up the operational structure upon which the whole endeavor is founded. The 
picture normally adopted is of a system passing sequentially through various 
boxes representing operations or, more generally, of many systems where, at 
any given time, each system passes through a box with, possibly, the same box 
acting on more than one system at once (see Fig. [1]). This simple picture 
is problematic for various reasons. First, there is no reason why the types and 
number of systems going into a box be equal to the types and number emerging. 
Second, the notion of system itself is not fully operational. Third (and most 
significantly), this circuit is understood as being embedded in a background 
Newtonian time and this constitutes structure in addition to a purely graphical 
interpretation of the diagram (it matters how high on the page the box is placed 
since this corresponds to the time at which the operation happens). 

To deal with these three points we set up a more general framework where
(1) we allow the number and types of systems going into a box to be different
from the number and types of systems going out, (2) we give an operational
definition of the notion of system, and (3) we define our temporal concepts
entirely in terms of the graphical information in the diagram. This third point
gives rise to a natural notion of spacelike hypersurface in such a way that
multiple foliations are possible. Hence, we call this a foliable operational
structure.

Figure 1: A naive picture of operationalism. Systems pass through boxes with respect
to a background time.

It is worth being careful to formulate the operational structure well since 
such structures form a foundation for general probabilistic theories. Different 
operational structures can lead to different probabilistic frameworks. Once we 
have an operational structure, we can introduce probabilities. We then proceed 
along a fairly clear path introducing the notions of preparation, transformation, 
measurement, and associating mathematical objects with these that allow
probabilities to be calculated. This gives an example of how an operational framework
can be a foundation for a general probabilistic theory. The foliable framework 
presented here is sufficient for the formulation of classical probability theory, 
quantum theory and potentially many theories beyond. 

However, the foliable operational structure still, necessarily, has a notion of 
definite causal structure - when a system passes between two boxes that corresponds
to a timelike separation (or null in the case of photons). We anticipate
that a theory of quantum gravity will be a probabilistic theory with indefinite 
causal structure. If this is true then we need a more general framework than the 
one presented here for quantum gravity. Preliminary ideas along this line have 
been presented in [1]. In future work it will be shown how the approach taken 
in this chapter can be generalized to theories without definite causal structure 
- that is non-foliable theories. First we must specify a sufficiently general non- 
foliable operational structure and then add probabilities (see [5] for an outline 
of these ideas). 

2 Related work 

The work presented here is a continuation of work initiated by the author in [6]
in which a general probabilistic framework, sometimes called the r-p framework
(because the vectors r and p represent effects and states, respectively), was
obtained for the purpose of reconstructing quantum theory from some simple
postulates. In [4] [7] the author adapted the r-p framework for the purpose of
describing a situation with indefinite causal structure to obtain a general
probabilistic framework that might be suitable for a theory of quantum gravity.
The idea that states should be represented by joint rather than conditional
probabilities used in these papers is also adopted here. Preliminary versions of
the work presented here can be seen in [8] [5].

In this work we consider arbitrary foliations of circuits. Thus we take a more
space-time based approach. There are many other space-time based approaches in
the context of a discrete setting in the literature (particularly work on quantum
gravity). Sorkin builds a discrete model of space-time based on causal sets [9].
Work in the consistent (or decoherent) histories tradition [10] [11] [12] [13] takes
whole histories as the basic objects of study. A particularly important and
influential piece of work is the quantum causal histories approach developed
by Markopoulou [14]. In this, completely positive maps are associated with
the edges of a graph with Hilbert spaces living on the vertices. Blute, Ivanov,
and Panangaden [15] (see also [16]), motivated by Markopoulou's work, took
the dual point of view with systems living on the edges (wires) and completely
positive maps on the vertices. The work of Blute et al., though restricted to
quantum theory rather than general probabilistic theories, bears much similarity
to the present work. In particular, similar notions of foliating circuits are to
be found in that paper. Leifer [17] has also done interesting work concerning 
the evolution of quantum systems on a causal circuit. 

Abramsky and Coecke showed how to formulate quantum theory in a category
theoretic framework [18] (see also [19]). This leads to a very rich and beautiful
diagrammatic theory in which many essential aspects of quantum theory can
be understood in terms of simple manipulations of diagrams. The diagrams can 
be understood operationally. Ideas from that work are infused into the present 
approach. Indeed, in category theoretic terms, the diagrams in this work can 
be understood as symmetric monoidal categories. 

The r-p framework in [6] is actually a simple example of a framework for
general probability theories going back originally to Mackey [20], which has been
worked on (and often rediscovered) by many others since, including Ludwig [21],
Davies and Lewis [22], Araki [23], Gudder [24], and Foulis and Randall [25].

Barrett elaborated on the r-p framework in [26]. He makes two assumptions -
that local operations commute and that local tomography is possible (whereby 
the state of a composite system can be determined by local measurements). In 
this work we do not make either assumption. The first assumption, in any case, 
would have no content since we are interested in the graphical information in 
a circuit diagram and interchanging the relative height of operations does not 
change the graph. Under these assumptions, Barrett showed that composite
systems can be associated with a tensor product structure. We recover
this here for the special case when we have local tomography but the more 
general case is also studied. In his paper Barrett shows that some properties 
which are thought to be specific to quantum theory are actually properties of
any non-classical probability theory.

More examples of this nature are discussed in various papers by Barrett,
Barnum, Leifer, and Wilce [27] [28] [29], and in [30] [31] the general probability
framework is further developed.

The assumption of local tomography is equivalent to the assumption that
$K_{ab} = K_a K_b$, where $K_{ab}$ is the number of probabilities needed to specify the
state of the composite system $ab$ and $K_a$ ($K_b$) is the number needed to describe
system $a$ ($b$) alone (this is the content of Theorem 5 below). Theories having
this property were investigated in 1990 in a paper by Wootters [32] (see also [33]),
who showed that they are consistent with the relation $K_a = N_a^r$, where $N_a$
is the number of states that can be distinguished in a single shot measurement
(this was used in [6] as part of the axiomatic structure).
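
The counting condition can be checked by hand for some familiar cases. The sketch below is only an illustration and is not taken from this chapter: it assumes that K counts the real parameters of an unnormalized state, that the composite system has $N_{ab} = N_a N_b$, and it uses the standard parameter counts for classical probability theory and for quantum theory over the real, complex, and quaternionic numbers. The function names are hypothetical.

```python
def k_classical(n):
    return n                    # one probability per distinguishable state

def k_complex(n):
    return n * n                # real parameters of an n x n complex Hermitian matrix

def k_real(n):
    return n * (n + 1) // 2     # real parameters of an n x n real symmetric matrix

def k_quaternionic(n):
    return n * (2 * n - 1)      # real parameters of an n x n quaternionic Hermitian matrix

def satisfies_product_rule(k, n_a, n_b):
    # Assumption (not from the text): N_ab = N_a * N_b for the composite system.
    return k(n_a * n_b) == k(n_a) * k(n_b)

for name, k in [("classical", k_classical), ("complex", k_complex),
                ("real", k_real), ("quaternionic", k_quaternionic)]:
    print(name, satisfies_product_rule(k, 2, 2))
```

Under these assumptions the classical and complex counts satisfy the product rule, while the real count gives too many parameters for the composite (10 against 9 for two two-level systems) and the quaternionic count too few (28 against 36).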

In 1994 Popescu and Rohrlich [34] exhibited correlations that maximally
violate Bell's inequality but do not permit signalling. These correlations are
more nonlocal than quantum correlations. Barrett asked what principles would
be required to prescribe such no-signalling correlations to the quantum limit
[26]. Pawlowski et al. [35] have shown that the Popescu-Rohrlich correlations
(and, indeed, any correlations more nonlocal than quantum theory) violate a
very natural principle they call the information causality principle. And Gross
et al. [36] have shown, as speculated by Barrett [26], that the dynamics in any
theory allowing Popescu-Rohrlich correlations are trivial.

Another line of work in this type of framework has been initiated by D'Ariano
in [37], who has a set of axioms from which he obtains quantum theory. A very
recent paper by Chiribella, D'Ariano, and Perinotti [38] sets up a general
probabilistic framework also having the local tomography property. Like Abramsky,
Coecke and co-workers, Chiribella et al. develop a diagrammatic notation with 
which calculations can be performed. They show that theories having the prop- 
erty that every mixed state has a purification have many properties in common 
with quantum theory. 

There have been many attempts at reconstructing quantum theory, not all of 
them in the probabilistic framework of the sort considered in the above works. 
A recent conference on the general problem of reconstructing quantum theory 
can be seen at 0. 

3 Essential concepts 
3.1 Operations and circuits 

The basic building block will be an operation. An operation is one use of an 
apparatus. An operation has inputs and outputs, and it also has settings and 
outcomes (see Fig. [2]). The inputs and outputs are apertures which we imagine
a system can pass through. Each input or output can be open or closed. For
example, we may close an output by blocking the aperture (we will explain the
significance of this later). The settings may be adjusted by knobs. The outcomes
may be read off a meter or digital display or correspond to a detector clicking
or lights flashing for example. It is possible that there is no outcome readout on
the apparatus in which case we can simply say that the set of possible outcomes
has only one member. The same apparatus may be used multiple times in a
given experiment. Each separate use constitutes an operation.

Figure 2: An operation having knob settings, measurement outcomes, and inputs (at
the bottom of the box) and outputs (at the top).

Each input or output is of a given type. We can think of the type as being 
associated with the type of system that we imagine passes through. The type 
associated with an electron is different than that associated with a photon. 
However, from an operational point of view, talk of electrons or photons is a 
linguistic shortcut for certain operational procedures. We might better say that 
the type corresponds to the nature and intended use of the aperture. Operations 
can be connected by wires between outputs and inputs of the same type. These 
wires do not represent actual wires but rather are a diagrammatic device to 
show how the apertures on the operations are placed next to one another - this 
is something an experimentalist would be aware of and so constitutes part of 
the operational structure. If we actually have a wire (an optical fibre say) this 
wire should be thought of as an operation itself and be represented by a box 
rather than a wire. Likewise, passage through vacuum also should be thought 
of as an operation. The wires show how the experiment is assembled. Often a 
piece of self-assembly furniture (from Ikea for example) comes with a diagram
showing an exploded view with lines connecting the places on the different 
parts of the furniture that should be connected. The wires in our diagrams are 
similar in some respects to the lines in these diagrams (though an experiment 
is something that changes in time and so the wires represent connections that 
may be transient whereas the connections in a piece of furniture are static). 

There is nothing to stop us trying to match an electron output with a photon 
input or even a small rock output with an atom input (this would amount to 
firing small rocks at an aperture intended for individual atoms). However this 
would fall outside the intended use of the apparatus and so we would not expect 
our theory to be applicable (and the apparatuses may even get damaged).

We will often refer to tracing forward through a circuit. By tracing forward 
we mean following a path through the circuit from the output of one operation, 
along the wire attached, to the input of another operation and then from an out- 
put of that operation, along the wire attached, to the input of another operation 
and so on. Such paths are analogous to future directed time-like trajectories in
spacetime physics.

Figure 3: A bunch of operations wired together form a circuit if there are no open
inputs or outputs left over. We require that there be no closed loops as we trace
forward. We have not drawn in the settings or the outcomes (these will usually be
taken to be implicit in these circuit diagrams). There are some closed inputs and
outputs.

We require that there can be no closed loops as we trace forward (i.e. that
we cannot get back to the same operation by tracing forward). This is a natural
requirement given that an operation corresponds to a single use of an apparatus 
(so long as there are no backward in time influences). It is this requirement of 
no closed loops that will enable us to foliate. 

In the case that we have a bunch of operations wired together with no open 
inputs or outputs left over then we will say we have a circuit (we allow circuits 
to consist of disconnected parts). An example of a circuit is shown in Fig. [3].
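
The wiring rules above (wires join an output to an input of the same type, no open inputs or outputs are left over, and tracing forward never returns to the same operation) can be sketched as a simple check. This is a minimal illustration rather than part of the framework: the data layout (`ops`, `wires`, `closed`) and the function name `is_circuit` are hypothetical, and settings and outcomes are left implicit.

```python
from graphlib import TopologicalSorter, CycleError

def is_circuit(ops, wires, closed=frozenset()):
    """ops:    {name: (input_types, output_types)}
    wires:  [((src, i), (dst, j))] joining output i of src to input j of dst
    closed: set of ("in" | "out", op, index) ports that have been closed."""
    used = set(closed)
    preds = {name: set() for name in ops}
    for (src, i), (dst, j) in wires:
        if ops[src][1][i] != ops[dst][0][j]:
            return False          # wires only join ports of the same type
        out_port, in_port = ("out", src, i), ("in", dst, j)
        if out_port in used or in_port in used:
            return False          # a port is wired at most once, and not if closed
        used |= {out_port, in_port}
        preds[dst].add(src)
    all_ports = {(side, name, k)
                 for name, (ins, outs) in ops.items()
                 for side, types in (("in", ins), ("out", outs))
                 for k in range(len(types))}
    if used != all_ports:
        return False              # an open input or output is left over
    try:
        TopologicalSorter(preds).prepare()   # raises CycleError on a closed loop
    except CycleError:
        return False
    return True

# A preparation with two outputs feeding a one-input measurement is only a
# circuit fragment until the spare output is closed.
ops = {"prep": ((), ("a", "b")), "meas": (("a",), ())}
wires = [(("prep", 0), ("meas", 0))]
print(is_circuit(ops, wires))
print(is_circuit(ops, wires, closed={("out", "prep", 1)}))
```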

As mentioned above, we assume that any input or output can be closed. 
This means that if we have a circuit fragment with open inputs and outputs 
left over we can simply close them to create a circuit. This is useful since the 
mathematical machinery we will set up starts with circuits. Closing an output 
can be thought of as simply blocking it off. The usefulness of the notion of 
closing an output relates to the possibility of having no influences from the 
future. This will be discussed in Sec. [3.3]. Closing an input can be thought
of as sending in a system corresponding to the type of input in some fiducial
state. We will not make particular use of the notion of closing an input (beyond
that it allows us to get circuits from circuit fragments) and so we need not be
more specific than this. We could set up the mathematical machinery in this
chapter without assuming that we can close inputs and outputs without much
more effort but the present approach has certain pedagogical advantages.

Figure 4: A set of wires is synchronous if it is not possible to get from one to another by
tracing forward. On the left we see an example of a set of wires which are synchronous
and on the right an example of a set which is not.

3.2 Time in a circuit 

We do not think of time as something in the background but rather define it 
in terms of the circuit itself. We take the attitude that two circuits having 
the same circuit diagram (in the graphical sense) are equivalent. Hence there 
is no physical meaning to sliding operations along wires with respect to some 
background time. This is a natural attitude given the interpretation of wires 
as showing how apertures are placed next to each other rather than as actual 
wires. 

We define a synchronous set of wires to be any set of wires for which there 
does not exist a path from one wire to another in the set if we trace forward 
along wires from output to input. See Fig. [4] for examples.
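
The definition of a synchronous set can be sketched as a reachability test. In this illustration (not part of the original text) each wire is recorded as a pair of hypothetical operation names, from the operation at its output end to the operation at its input end, and one wire is taken to lead to another if the second wire's source operation can be reached by tracing forward from the first wire's destination operation.

```python
def reachable(wires, start):
    """Operations reachable from `start` by tracing forward (including `start`)."""
    succ = {}
    for src, dst in wires.values():
        succ.setdefault(src, set()).add(dst)
    seen, stack = {start}, [start]
    while stack:
        op = stack.pop()
        for nxt in succ.get(op, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_synchronous(wires, chosen):
    """True if no wire in `chosen` can be reached from another by tracing forward."""
    for w1 in chosen:
        downstream = reachable(wires, wires[w1][1])
        for w2 in chosen:
            if w2 != w1 and wires[w2][0] in downstream:
                return False
    return True

# Hypothetical diamond-shaped circuit: A feeds B and C, which both feed D.
wires = {"w1": ("A", "B"), "w2": ("A", "C"), "w3": ("B", "D"), "w4": ("C", "D")}
print(is_synchronous(wires, {"w1", "w2"}))   # side by side
print(is_synchronous(wires, {"w1", "w3"}))   # w3 lies downstream of w1
```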

We call a synchronous set of wires a hypersurface, $H$, if it partitions the
circuit into two parts, $C_H^-$ and $C_H^+$, that are not connected other than by
wires in the hypersurface. Each of the wires in the hypersurface has an end
connected to an output (the "past") and an end connected to an input (the "future").
$C_H^-$ is the part of the circuit to the "past" and $C_H^+$ is the part of the
circuit to the "future" of the hypersurface. A hypersurface, as defined here, is
the circuit analogue of a spacelike hypersurface in spacetime physics.

We say two hypersurfaces are distinct if at least some of the wires are different.
We say that hypersurface $H_2$ is after hypersurface $H_1$ if the
past of $H_1$ (this is $C_{H_1}^-$) and the future of $H_2$ (this is $C_{H_2}^+$) have no
operations in common. If we can get to every wire in $H_2$ by tracing forward from
a wire in $H_1$ then $H_2$ is after $H_1$ (there are, however, examples of $H_2$ after
$H_1$ that are not like this).

A foliation is an ordered set of hypersurfaces $\{H_t\}$ such that $H_{t+1}$ is after
$H_t$. A complete foliation is a foliation that includes every wire in the circuit.
It is easy to prove that complete foliations exist for every circuit. Define an
initial wire to be one connected to an output of an operation having no open
inputs. Take the set of all initial wires (this cannot be the null set as long
as we have at least one connecting wire in this circuit). These wires form a
hypersurface $H_1$. Consider the set of operations for which these wires form
inputs. Since, according to the wiring constraints, there can be no closed loops,
there must exist at least one operation in this set which has no inputs from
wires connected to outputs of other operations in this set. Substitute the wires
coming out of one such operation for the wires going into the operation in $H_1$ to
form a new spacelike hypersurface $H_2$ (this is after $H_1$). This can be repeated
until all wires have been included forming a complete foliation. This proves
that complete foliations always exist. There can, of course, exist other complete
foliations that are not obtainable by this technique.
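
The construction in this proof can be sketched as a short procedure. The sketch below is an illustration under stated assumptions, not the author's own algorithm: wires are recorded as pairs of hypothetical operation names, the starting hypersurface consists of the wires leaving operations that have no wired inputs, and at each step the sketch advances past an operation all of whose input wires already lie in the current hypersurface (a slightly more explicit selection rule than the one stated in the text).

```python
def complete_foliation(wires):
    """wires: {name: (src_op, dst_op)}. Returns a list of hypersurfaces,
    each a set of wire names, ordered so that each is after the previous one."""
    ops = {op for pair in wires.values() for op in pair}
    ins = {op: {w for w, (_, d) in wires.items() if d == op} for op in ops}
    outs = {op: {w for w, (s, _) in wires.items() if s == op} for op in ops}
    done = {op for op in ops if not ins[op]}          # operations with no wired inputs
    current = set().union(*(outs[op] for op in done)) # the initial wires
    foliation = [set(current)]
    while len(done) < len(ops):
        # Operations whose input wires all lie in the current hypersurface.
        ready = sorted(op for op in ops - done if ins[op] <= current)
        if not ready:
            raise ValueError("closed loop: no operation can be advanced past")
        op = ready[0]
        current = (current - ins[op]) | outs[op]      # swap inputs for outputs
        done.add(op)
        if current:
            foliation.append(set(current))
    return foliation

# Hypothetical diamond-shaped circuit: A feeds B and C, which both feed D.
wires = {"w1": ("A", "B"), "w2": ("A", "C"), "w3": ("B", "D"), "w4": ("C", "D")}
for t, hypersurface in enumerate(complete_foliation(wires), start=1):
    print(t, sorted(hypersurface))
```

Choosing a different ready operation at each step (here the choice is alphabetical, purely for determinism) yields a different complete foliation of the same circuit.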

Although we do not use a notion of a background time to time-order our 
operations, it is the case that these foliable structures are consistent with a
Newtonian notion of an absolute background time. Simply choose one complete 
foliation and take that as corresponding to our Newtonian time. They are, 
however, more naturally consistent with relativistic ideas since, for a general 
circuit, there exist multiple foliations. 

3.3 Probability 

Now we are in a position to introduce probability into the picture. Probability 
is a deeply problematic notion from a philosophical point of view [39]. There are
various competing interpretations. All these interpretations attempt to account
for the empirical fact that, in the long run, relative frequencies are stable - that
if you toss a coin a million times and get 40% heads, then if you toss the same
coin a million times again, you will get 40% heads again (more or less). It is not
the purpose of this chapter to solve the interpretational problems of probability 
and so we will adopt the point of view that probability is a limiting relative 
frequency. This gives us the basic mathematical properties of probability: 

1. Probabilities are non-negative.

2. Probabilities sum to 1 over a complete set of mutually exclusive events. 

3. Bayes' rule, Prob(A&B) = Prob(A|B)Prob(B), applies.

We could equally adopt any other interpretation of probability that gives us 
these mathematical properties and set up the same theoretical framework. 

Typically an experimentalist will have available to him some set of opera- 
tions, O, he can use to build circuits. On each operation in the circuit are various 
possible settings (among which the experimentalist can choose) and various
outcomes, one of which will happen. We say the circuit is setting specified if the
setting on each operation is given. We say the circuit is setting-outcome specified
if the setting and outcome on each operation is specified. A setting-outcome
specified circuit corresponds to what happens in a single run of the experiment.
We define:

Figure 5: A set of operations is closable if any circuit built from it has a probability
associated with it depending only on that circuit and if that probability is independent 
of any extra circuitry. 



Closable sets of operations. A set of operations, O, is said to be closable if, 
for every setting-outcome specified circuit that can be built from O, 

(i) there is a probability depending only on the particulars of this circuit that 
is independent of choices made elsewhere, and 

(ii) if we open closed outputs on this circuit and add on extra circuitry then 
the probability associated with the original bit of setting-outcome specified 
circuitry (ignoring outcomes associated with the extra bit) is unchanged 
for any such extra circuitry we can add (see Fig. [5]).
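
Part (ii) can be illustrated with a toy classical example. This is only a sketch and is not taken from the text: it assumes a preparation whose outcome x occurs with probability `prep[x]` when its output is closed, and an extra operation attached to the opened output whose outcome y occurs with conditional probability `extra[x][y]`. All names and numbers are hypothetical. Ignoring the outcome on the extra circuitry means summing over y.

```python
from fractions import Fraction as F

# Prob(outcome x) for the original circuit, with the output closed.
prep = {0: F(2, 5), 1: F(3, 5)}

# extra[x][y] = Prob(y | x) for the circuitry added on the opened output.
extra = {0: {0: F(9, 10), 1: F(1, 10)},
         1: {0: F(1, 4), 1: F(3, 4)}}

def marginal_after_extension(prep, extra, x):
    """Probability of outcome x on the original part of the extended circuit,
    ignoring (summing over) the outcome y of the extra circuitry."""
    return sum(prep[x] * p_y for p_y in extra[x].values())

for x in prep:
    print(x, prep[x], marginal_after_extension(prep, extra, x))
```

In this toy model the probability for the original circuit is unchanged because each conditional distribution `extra[x]` sums to 1. Exact fractions are used so that the comparison does not depend on floating-point rounding.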

Part (i) of this definition concerns choices made elsewhere. These could be 
choices of settings on operations in other circuits (disjoint from this one), choices 
of what circuits to build elsewhere, or choices not even having to do with cir- 
cuits built from the given set of operations. We might also have said that the 
probability associated with a setting-outcome specified circuit is independent 
of the outcomes seen in other circuits. This is a very natural assumption since 
otherwise the probability attached to a particular circuit could be different if we 
restrict our attention to the case where we had seen some particular outcomes 
on other circuits. In the case where (i) and (ii) hold and also the probability is 
independent of the outcomes in other circuits we will say the set of operations 
is fully closable. It turns out we can go a very long way without assuming this. 
Further, we will prove in Section [?] that if a very natural condition holds
then closable sets are, in any case, fully closable. 

Part (ii) imposes a kind of closure from the future - choices on operations 
only connected to a part of a circuit by outgoing wires (or even choices of what 
circuitry to place after outgoing wires) do not affect the probabilities for this
circuit part. This could almost be regarded as a definition of what we mean by
wires going from output to input. We do not regard it as a definition though
since it corresponds to a rather global property of circuits rather than a property
specific to a given wire.

Figure 6: If there is no-signalling between A and B then the correct circuit is that
shown in (a). However, if there is signalling from B to A then (a) could not be
the correct circuit. Instead, the correct circuit would have to be something like that
shown in (b). The framework is perfectly capable of incorporating signalling. Hence
the assumption of no-signalling is not an assumption of the framework but rather
corresponds to asserting that the correct circuit for the given no-signalling situation
is the one in (a).

It is interesting to consider examples of sets of operations which are not 
closable. Imagine for example that, among his operations, an experimentalist 
has an apparatus which he specifies as implementing an operation on two qubits 
but actually it implements an operation on three qubits - there is an extra input 
aperture he is unaware of. If he builds a circuit using this gate then an adversary 
can send a qubit into the extra input which will affect the probability for the
circuit. Thus, the probability would depend on a choice made elsewhere. In 
this case we could fix the situation by properly specifying the operation to 
include the extra input. The notion of closability is important since it ensures 
that the experimentalist has full control of his apparatuses. It is possible, at 
least in principle, that a set of operations cannot be closed by discovering extra 
apertures. By restricting ourselves to physical theories that admit closability 
we are considering a subset of all possible theories (though a rather important 
one). 

3.4 Can no-signalling be an axiom? 

Many authors have promoted no-signalling as an axiom for Quantum Theory. 
It may appear that part (ii) of the definition of closability sneaks in a no- 
signalling assumption here. In fact this is not the case. Indeed, a no-signalling 
axiom would actually have limited content in this circuit framework. Consider 
the circuit shown in Fig. [6](a). A no-signalling axiom would assert that a choice
at one end, B say, cannot affect the probability of outcomes at the other end,
A. However, this is actually implied by part (ii) since B is connected to the AC
part of the circuit by an out-going wire. Hence, it looks like we are assuming
no-signalling. However, this is not an assumption for the framework but only a
consequence of asserting that the correct circuit is the one in Fig. [6](a).
Imagine that there actually was signalling, such that probabilities for the AC part
of the circuit depended on a choice at B. Then, under the assumption that the
operations are drawn from a closable set, it is clear that the situation cannot
be described by Fig. [6](a). Rather, the situation would have to be that shown
in Fig. [6](b) where there is an extra wire going from B to A (or something like
this but with more structure). This framework is perfectly capable of accommo- 
dating both signalling and no-signalling situations by appropriate circuits and, 
consequently, we are not sneaking in a no-signalling assumption. 

This reasoning leads us to question whether a no-signalling axiom could do 
any work at all. Indeed it is often unappreciated in such axiomatic discussions 
that the usual framework of Quantum Theory does allow signalling. One can 
write down nonlocal Hamiltonians which will, for example, entangle product 
states. Of course, in Quantum Field Theory one incorporates a no-signalling 
property by demanding that field operators for spacelike separated regions commute
so that such nonlocal Hamiltonians are ruled out. However, this is an example
where we have a given background. In general, a no-signalling axiom with
respect to some particular given background would restrict the type of circuits
we allow. For example the circuit in Fig. [6](b) would not be allowed unless A
was in the future light cone of B. In quantum field theory we have an example
where a no-signalling axiom with respect to a Minkowski background restricts 
the types of unitary evolution and measurement that are allowed. However, it 
is often claimed that the abstract Hilbert space framework itself (which makes 
no mention of Minkowski spacetime) can be derived using no-signalling as one 
of the axioms. It is this more ambitious claim we question. In fact we will see 
that we can define this abstract framework of quantum theory for any circuit 
(as long as there are no closed loops) including no-signalling and signalling
situations (as in Fig. [6](a) and (b)). Hence a no-signalling axiom clearly cannot be
regarded as a constraint on this abstract framework. 

This criticism of the usefulness of no-signalling as an axiom does not apply 
to a recently proposed generalisation of this principle in an extraordinary paper 
by Pawlowski et al. [35]. They introduced the information causality principle.
It was shown that this very compelling principle limits violations of the Bell 
inequality to the quantum limit. Imagine that Alice receives n classical bits of 
information. She communicates m classical bits to Bob. Bob is expected to 
reveal the value of one of the n classical bits though neither Alice or Bob know 
which one this will be in advance. The information causality principle is that 
Alice and Bob can only be successful when n is less than or equal to m. For 
m = this is the no-signalling assumption we have criticized. The information 
causality principle can be read as implying that if the task cannot be achieved 
for TO = then it cannot be achieved for any other value of m. This principle 
would be useful in prescribing what is possible in the framework described in 
this chapter. Consider two fragments of a circuit that cannot be connected by 
tracing forward (these fragments are analogous to spacelike separated regions). 
The information causality principle implies there is no way of accomplishing the 
above task for any to with n > m between these two circuit fragments. That we 
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cannot do this for m = is already implied (assuming that we have the correct 
circuit). 

Although a simple no-signalling "across space" principle is of limited use
for the above reasons, we do employ what might be regarded as a no-signalling
backward in time principle since we do not allow closed loops in a circuit.

3.5 Systems and composite systems 

We wish to give an operational definition of what we mean by the notion of
a system. We may find that whenever we press a button on one box, a light
goes on on another box. We can interpret this in terms of a system passing 
between the two boxes. We find this happens only when we place the boxes in a 
certain arrangement with respect to one another (which we think of as aligning 
apertures). Given this we clearly want to associate systems with wires. Hence, 
we adopt the following definition: 

A system of type abc ... is, by definition, associated with any set 
of synchronous wires of type a,b,c, . . . in any circuit formed from a 
closable set of operations. 

We may refer to a system type corresponding to more than one wire by a single 
letter. Thus we may denote the system type abc by d. 

The usefulness of the notion of closable sets of operations is that it leads to
wires being associated with the sort of correlation we expect for systems given 
our usual intuitions about what systems are. Nevertheless, our definition of 
system is entirely operational since wires are defined operationally. 

It is common to speak of composite systems. We define a composite system 
as follows: 

A composite system, AB, is associated with any two systems 
(each associated with disjoint sets of wires) if the union of the sets 
of wires associated with system A and system B forms a synchronous 
set. 

This definition generalizes in the obvious way for more than two systems. A 
system of type aabc can be regarded as a composite of systems of type aa and 
bc, or a composite of systems of type aac and b, or a composite of systems of
type ac, a and b to list just a few possibilities. Systems associated with a single 
wire cannot be regarded as composite. 

A hypersurface consists of synchronous wires and so can be associated with a 
system (or composite system) . A complete foliation can therefore be associated 
with the evolution of a system through the circuit (though the system type can 
change after each step). This evolution can also be viewed as the evolution of a 
composite system.

Figure 7: A circuit fragment is a part of the circuit having input wires in a
synchronous set and output wires also in a synchronous set. A circuit can be divided
up many ways into circuit fragments corresponding to preparations (no open inputs),
transformations, and effects (no open outputs).



4 Preparations, transformations, and effects 
4.1 Circuit fragments 

We can divide up a circuit into fragments corresponding to preparations,
transformations, and effects as shown in Fig. 7. By the term circuit fragment we
mean a part of a circuit (a subset of the operations in the circuit along with the
wires connecting them) having inputs coming from a synchronous set of wires
and outputs going into a synchronous set of wires. We allow lone wires in a
circuit fragment (wires not connected to any operations in the fragment). An
example of a lone wire is seen in the circuit fragment in the rectangle on the left
in Fig. 7. The lone wire corresponds to the identity transformation on that
system and contributes an input and output to the circuit fragment. Generally, we
take the term circuit fragment to imply that the settings and outcomes at each
operation associated with these circuit fragments have been specified. A circuit
fragment is, essentially, an operation at a coarse-grained level. Preparations
correspond to a circuit fragment having outputs but no open inputs.
Transformations have inputs and outputs. Effects have inputs but no open outputs.
Note that preparations and effects are special cases of transformations.

4.2 States



A preparation prepares a system. For a given type of system there will be many
possible preparations. We will label them with $\alpha$. This label tells us what circuit
fragment is being used to accomplish the preparation (including the specification
of the knob settings and outcomes on each operation). Associated with each
preparation for a system of type $a$ will be a state (labeled by $\alpha \in \mathrm{Prep}_a$). We
can build a circuit having this preparation by adding an effect for a system
of type $a$. There are many possible effects labeled by $\beta \in \mathrm{Eff}_a$. The label $\beta$
tells us what circuit fragment is used to accomplish the effect along with the
knob settings and outcomes at each operation. Associated with the circuit is a
probability $p^{\alpha\beta}$. We define the state associated with preparation $\alpha$ to be that
thing represented by any mathematical object that can be used to calculate $p^{\alpha\beta}$
for all $\beta$. We could represent the state by the long list

$$ \begin{pmatrix} \vdots \\ p^{\alpha\beta} \\ \vdots \end{pmatrix}, \qquad \beta \in \mathrm{Eff}_a \tag{1} $$

However, in general, physical theories relate physical quantities. Hence, it is only
necessary to list a subset of these quantities where the remaining quantities can
be calculated by the equations of the physical theory. We call the forming of
this subset of quantities physical compression. In the current case, we expect
the probabilities in this list to be related. We consider the maximum amount of
physical compression that is possible by linear means. Thus we write the state
as

$$ \mathbf{p}_a^{\alpha} = \begin{pmatrix} \vdots \\ p^{\alpha\beta} \\ \vdots \end{pmatrix}, \qquad \beta \in \Omega_a \subseteq \mathrm{Eff}_a \tag{2} $$

where there exist vectors $\mathbf{r}_a^{\beta}$ such that

$$ p^{\alpha\beta} = \mathbf{r}_a^{\beta} \cdot \mathbf{p}_a^{\alpha} \quad \text{for all } \alpha \in \mathrm{Prep}_a,\ \beta \in \mathrm{Eff}_a \tag{3} $$

We call the set $\Omega_a$ the fiducial set of effects for a system of type $a$. The choice
of fiducial set need not be unique - we simply make one choice and stick with it.
Since we have applied as much linear physical compression as possible, $|\Omega_a|$ is the
minimum number of probabilities required to calculate all the other probabilities
by linear equations. The vectors $\mathbf{r}_a^{\beta}$ are associated with effects on a system of
type $a$. For the fiducial effects they consist of a 1 in the $\beta$ position and 0's
elsewhere.
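
As an illustration of linear physical compression, the following minimal sketch builds a table of probabilities $p^{\alpha\beta}$ for a toy theory and extracts a fiducial set and the effect vectors from it. The toy theory (a classical three-level system), the random data, and the helper name `fiducial_columns` are assumptions made for the example only; they do not appear in the text.

```python
import numpy as np

# Hypothetical toy theory: a classical three-level system.
# Rows index preparations alpha, columns index effects beta,
# and P[alpha, beta] plays the role of the joint probability p^{alpha beta}.
rng = np.random.default_rng(0)
states = rng.dirichlet(np.ones(3), size=6) * rng.uniform(0.5, 1.0, size=(6, 1))
effects = rng.uniform(0.0, 1.0, size=(3, 8))
P = states @ effects  # shape (6 preparations, 8 effects)


def fiducial_columns(P, tol=1e-10):
    """Greedily pick a maximal set of linearly independent columns of P."""
    chosen = []
    for beta in range(P.shape[1]):
        trial = chosen + [beta]
        if np.linalg.matrix_rank(P[:, trial], tol=tol) == len(trial):
            chosen.append(beta)
    return chosen


omega = fiducial_columns(P)   # indices of the fiducial effects (the set Omega_a)
p_vecs = P[:, omega]          # row alpha is the compressed state vector p_a^alpha

# Solve p_vecs @ R = P; column beta of R is the effect vector r_a^beta of eq. (3).
R, *_ = np.linalg.lstsq(p_vecs, P, rcond=None)
```

In this sketch every entry of the full table is recovered from `len(omega)` numbers per preparation, and the columns of `R` belonging to the fiducial effects are unit vectors, as stated above.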

An important subtlety here is that we define states in terms of joint rather
than conditional probabilities. This makes more sense for the circuit model
since, generally, we want to calculate a probability for a circuit. If we want to
calculate conditional probabilities we can use Bayes' rule in the standard way.

4.3 Transformations



Now consider a preparation $\alpha$ which prepares a system of type $a$ in state $\mathbf{p}_a^{\alpha}$
followed by a transformation $\beta$ which outputs a system of type $b$. We can regard
the preparation and transformation, taken together, as a new preparation $\alpha\beta$
for a system of type $b$ with state $\mathbf{p}_b^{\alpha\beta}$. Now follow this by a fiducial effect $\gamma \in \Omega_b$.
The probability for the circuit $\alpha\beta\gamma$ can be written

$$ p^{\alpha\beta\gamma} = \mathbf{r}_b^{\gamma} \cdot \mathbf{p}_b^{\alpha\beta} = \mathbf{r}_a^{\beta\gamma} \cdot \mathbf{p}_a^{\alpha} \tag{4} $$

where $\mathbf{r}_a^{\beta\gamma}$ is the effect vector associated with the measurement consisting of the
transformation $\beta$ followed by the effect $\gamma$. Given the special form of the fiducial
effect vectors it follows that the state transforms as

$$ \mathbf{p}_b^{\alpha\beta} = {}^{b}Z_a^{\beta}\, \mathbf{p}_a^{\alpha} \tag{5} $$

where ${}^{b}Z_a^{\beta}$ is a $|\Omega_b| \times |\Omega_a|$ matrix such that its $\gamma$ row is given by the components
of the effect vector $\mathbf{r}_a^{\beta\gamma}$. We use a subscript, $a$, for the inputted system type
and a pre-superscript, $b$, for the outputted system type. Hence, a general
transformation is given by a matrix acting on the state. If the matrix transforms
from one type of system to another type of system with a different number of
fiducial effects then it will be rectangular.

The general equation for calculating probabilities is

$$ p^{\alpha\beta\gamma} = (\mathbf{r}_b^{\gamma})^{T}\, {}^{b}Z_a^{\beta}\, \mathbf{p}_a^{\alpha} \tag{6} $$

where $T$ denotes transpose. If we have more than one transformation then, by
a clear extrapolation of the above reasoning, we can write

$$ p^{\alpha\beta\gamma\delta} = (\mathbf{r}_c^{\delta})^{T}\, {}^{c}Z_b^{\gamma}\, {}^{b}Z_a^{\beta}\, \mathbf{p}_a^{\alpha} \tag{7} $$

and so on. Now since the $Z$ matrices can be rectangular we can think of $\mathbf{p}_a^{\alpha}$ and
$(\mathbf{r}_c^{\delta})^{T}$ as instances of a transformation matrix. The state $\mathbf{p}_a^{\alpha}$ can be thought of
as corresponding to the transformation which turns a null system (no system at
all) into a system outputted by this preparation and we can change our notation
to ${}^{a}Z^{\alpha}$ instead (a column vector being a special case of a rectangular matrix).
Likewise we can change our notation for the row vector $(\mathbf{r}_c^{\delta})^{T}$ to $Z_c^{\delta}$ (a row
vector being a special case of a rectangular matrix). Then we can write

$$ p^{\alpha\beta\gamma\delta} = Z_c^{\delta}\, {}^{c}Z_b^{\gamma}\, {}^{b}Z_a^{\beta}\, {}^{a}Z^{\alpha} \tag{8} $$

The agreement of output and input system types is clear (by matching
pre-superscripts with subscripts between the $Z$'s).
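
The type matching in a closed expression of this kind can be made concrete with a short sketch. The example below is an illustration under stated assumptions: it uses a hypothetical classical system (a two-level system mapped to a three-level system), the matrices are invented numbers, and the helper `circuit_probability` is not from the text.

```python
import numpy as np

# Hypothetical classical example: a two-level system (type a) is prepared,
# mapped to a three-level system (type b), and then subjected to an effect.
Z_prep = np.array([[0.3], [0.5]])      # column vector: a subnormalized state of type a
Z_trans = np.array([[1.0, 0.0],
                    [0.0, 0.5],
                    [0.0, 0.5]])       # 3x2 rectangular matrix: transformation a -> b
Z_eff = np.array([[1.0, 1.0, 0.0]])    # row vector: an effect on type b


def circuit_probability(*matrices):
    """Multiply Z matrices (written left to right as in a closed expression),
    checking that adjacent output and input system sizes agree."""
    result = matrices[-1]
    for Z in reversed(matrices[:-1]):
        if Z.shape[1] != result.shape[0]:
            raise ValueError("output/input system types do not match")
        result = Z @ result
    if result.shape != (1, 1):
        raise ValueError("expression is not closed")
    return float(result[0, 0])


p = circuit_probability(Z_eff, Z_trans, Z_prep)
```

Treating the state as a column and the effect as a row means a single matrix product covers preparations, transformations, and effects alike; a mismatch in system type shows up as a shape error.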

The label $\alpha$ labels the circuit fragment along with the knob settings and the
outcomes. Sometimes it is useful to break these up into separate labels. Thus
we write

$$ \alpha = (\mathcal{F}, \varphi, l) \tag{9} $$

where $\mathcal{F}$ denotes the circuit fragment before the settings and outcomes are
specified, $\varphi$ denotes the settings on the operations, and $l$ denotes the outcomes.
In particular, this means that we can notate the effects associated with the
different outcomes of a measurement as $\alpha_i = (\mathcal{F}, \varphi, l_i)$.

Let $L$ be the set of possible outcomes $l_i$ for a given measurement (with fixed
$\mathcal{F}$ and $\varphi$). We can subdivide the set $L$ into disjoint sets $L_k$ where $\bigcup_k L_k = L$.
We could choose to be ignorant of the actual outcome $l_i$ and rather only record
which set $L_k$ it belongs to. In this case the transformation effected can be
denoted $\alpha_k$. Since we have used linear compression, we must have

$$ {}^{b}Z_a^{\alpha_k} = \sum_{i \in I_k} {}^{b}Z_a^{\alpha_i} \tag{10} $$

where $I_k$ is the set of $i$'s corresponding to the $l_i$'s in $L_k$. Since we can always
choose to be ignorant in this way, we must include such transformations in the
set of allowed transformations.
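
The coarse-graining rule can be checked on a small example. The sketch below assumes a hypothetical three-outcome measurement on a classical bit; the numerical entries and the partition labels `k0`, `k1` are invented for illustration.

```python
import numpy as np

# Hypothetical measurement with three outcomes on a classical bit;
# Z[i] stands for the row vector associated with outcome i.
Z = {0: np.array([[0.7, 0.1]]),
     1: np.array([[0.2, 0.3]]),
     2: np.array([[0.1, 0.6]])}
partition = {"k0": [0, 1], "k1": [2]}   # disjoint sets of outcomes covering all of them

# Coarse-grained matrices: sum the fine-grained ones within each set.
Z_coarse = {k: sum(Z[i] for i in idx) for k, idx in partition.items()}

p_state = np.array([[0.4], [0.6]])      # a normalized state
p_fine = {i: float((Z[i] @ p_state)[0, 0]) for i in Z}
p_coarse = {k: float((Zk @ p_state)[0, 0]) for k, Zk in Z_coarse.items()}
```

Because probabilities are linear in the matrices, the probability of recording a set of outcomes equals the sum of the probabilities of its members.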

The matrices corresponding to the set of allowed transformations must be
such that, when closed expressions such as (8) are calculated, they always give
probabilities between 0 and 1. This is an important constraint on this
framework.

4.4 The identity transformation 

One transformation we can consider is where we do nothing. The wires coming 
in are the same as the wires coming out and no operation has intervened. We 
will denote this transformation by 0. Then we have, for example,

$$ {}^{a}Z_a^{0} \tag{11} $$

This is a $|\Omega_a| \times |\Omega_a|$ matrix and must be equal to the identity since, as long as it
is type matched, it can be inserted as many times as we like into any expression 
where non-trivial transformations act also. 

4.5 The trace measurement 

One effect we can perform on a preparation $\alpha$ is to close all outputs. This forms
a circuit and hence there is an associated probability, $p^{\alpha -}$ (where $-$ denotes
that the outputs have been closed). This is an effect and hence we must have

$$ p^{\alpha -} = \mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} \tag{12} $$

where the vector $\mathbf{r}_a^{-}$ corresponds to this effect. We call this the trace
measurement (terminology borrowed from quantum theory where this effect corresponds
to taking the trace of the density matrix). It follows from part (ii) of the
condition for a closable set of operations that this is the probability associated
with the preparation part of the circuit even if the outputs are open and more
circuitry is added.

In the case that $\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} = 1$ we say that the state is of norm one. In general
we do not expect states to be of norm one since they consist of joint rather than
conditional probabilities and hence we require only that $0 \le \mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} \le 1$.
We can normalize a state by dividing by $\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha}$. We denote the normalized
state by

$$ \hat{\mathbf{p}}_a^{\alpha} = \frac{\mathbf{p}_a^{\alpha}}{\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha}} \tag{13} $$

We cannot guarantee that $\hat{\mathbf{p}}_a^{\alpha}$ belongs to the set of allowed states (i.e. that there
exists a preparation $\alpha'$ for which $\mathbf{p}_a^{\alpha'} = \hat{\mathbf{p}}_a^{\alpha}$) since preparations for all states parallel to $\mathbf{p}_a^{\alpha}$
may be intrinsically probabilistic.



5 Mixtures 

5.1 Forming Mixtures 

Imagine that we have a box with a light on it that can flash and an aperture
out of which a system of type $a$ can emerge. With probability $\lambda_i$ we place
preparation $\alpha_i$ in the box such that the system (which we take to be of type $a$)
will emerge out of the aperture and such that the light will flash if the outcomes
corresponding to this preparation are seen. The state prepared for this one $i$ is
$\lambda_i \mathbf{p}_a^{\alpha_i}$. If $\lambda_i = 0$ then the state prepared is the null state $\mathbf{0}_a$ which has all 0's as
entries. If we use this box for a set $\alpha_i$ with $i \in I$ such that $\sum_{i \in I} \lambda_i = 1$ then
the state prepared is

$$ \sum_{i \in I} \lambda_i\, \mathbf{p}_a^{\alpha_i} \tag{14} $$

This is a linear sum of terms since we have linear compression. This process of
using a box may be beyond the experimental capacities of a given
experimentalist. It certainly takes us outside the circuit model as previously described.
However, we can always consider taking mixtures like this at a mathematical
level.

A technique that can be described in the circuit model is the following.
Consider placing a single preparation circuit into the box described above where
$\alpha_j = (\mathcal{F}, \varphi, l_j)$ and $l_j$ labels the outcomes. We can arrange things so that the
light flashes only if $j \in J$ (where $J$ is some subset of the $j$'s). The state is then
given by

$$ \sum_{j \in J} \mathbf{p}_a^{\alpha_j} \tag{15} $$

This technique does not require having a coin to generate probabilities $\lambda_i$ and
neither does it require the placing of different circuit fragments into a box
depending on the outcome of the coin toss.

The most general thing we can do is a mixture of the two above techniques.
With probability $\lambda_i$ we place a circuit $\mathcal{F}_i$ with settings $\varphi_i$ and outcomes $l_{ij}$ in
the box for $i \in I$ and $j \in J_i$. Writing $\alpha_{ij} = (\mathcal{F}_i, \varphi_i, l_{ij})$, the state we obtain is

$$ \sum_{i \in I} \sum_{j \in J_i} \lambda_i\, \mathbf{p}_a^{\alpha_{ij}} \tag{16} $$

We can absorb the $\lambda$'s into the $\mathbf{p}$'s and relabel so we obtain that a general
mixture is given by

$$ \mathbf{p}_a^{\alpha} = \sum_i \mathbf{p}_a^{\alpha_i} \tag{17} $$

We have the constraints that $\mathbf{r}_a^{\beta} \cdot \mathbf{p}_a^{\alpha} \ge 0$ and $\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} \le 1$.

Note that if we write $\mathbf{p}_a^{\alpha_i} = \mu_i \hat{\mathbf{p}}_a^{\alpha_i}$ where $\hat{\mathbf{p}}_a^{\alpha_i}$ is normalized, and include an
extra state $\hat{\mathbf{p}}_a^{0} = \mathbf{0}_a$ (the null state) then we have

$$ \mathbf{p}_a^{\alpha} = \sum_i \mu_i\, \hat{\mathbf{p}}_a^{\alpha_i} \quad \text{where } \mu_i \ge 0,\ \sum_i \mu_i = 1 \tag{18} $$

where the sum now includes the null state. Hence we can interpret a general
mixture as a convex combination of normalized states and the null state.
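
The rewriting of a general mixture as a convex combination can be sketched numerically. The example assumes a classical bit whose trace effect is the all-ones vector; the two subnormalized states and the helper `normalize` are hypothetical and chosen only for illustration.

```python
import numpy as np

r_trace = np.array([1.0, 1.0])                          # assumed trace effect for a classical bit
states = [np.array([0.5, 0.0]), np.array([0.1, 0.2])]   # subnormalized states in the mixture


def normalize(p):
    """Return (mu, p_hat) with p = mu * p_hat and r_trace . p_hat = 1."""
    mu = float(r_trace @ p)
    return mu, p / mu


general_mixture = sum(states)                           # a sum of states, as in a general mixture
weights, normalized = zip(*(normalize(p) for p in states))
mu_null = 1.0 - sum(weights)                            # weight carried by the null state
null_state = np.zeros(2)

# Convex combination of normalized states and the null state.
convex = sum(m * ph for m, ph in zip(weights, normalized)) + mu_null * null_state
```

The weight assigned to the null state is exactly the probability that the preparation does not succeed, which is why the weights sum to one.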

5.2 Homogeneous, pure, mixed, and extremal states 

If two states are parallel then they give rise to the same statistics up to an
overall weighting and if we condition on the preparation then they have exactly
the same statistics. We define a homogeneous state as one which can only be
written as a sum of parallel states. Thus $\mathbf{p}_a^{\alpha}$ is homogeneous if, for any sum,

$$ \mathbf{p}_a^{\alpha} = \sum_{i \in I} \mathbf{p}_a^{\alpha_i} \tag{19} $$

we have $\mathbf{p}_a^{\alpha_i} = \eta_i \mathbf{p}_a^{\alpha}$ for all $i \in I$. A state which is not homogeneous (i.e.
which can be written as a sum of at least two non-parallel states) is called a
heterogeneous state.

Given a particular homogeneous state, there will, in general, be many others
which are parallel to it but of different lengths. We call the longest among these
a pure state.

A mixed state is defined to be any state which can be simulated by a
probabilistic mixture of distinct states in the form $\sum_j \lambda_j \mathbf{p}_a^{\alpha_j}$ where $\lambda_j > 0$ and
$\sum_j \lambda_j = 1$. A pure state is not a mixture (since the $\mathbf{p}_a^{\alpha_j}$'s would have to be parallel
to the given pure state, and therefore, given the $\sum_j \lambda_j = 1$ condition, equal to
the given pure state). Homogeneous states which are not pure are mixtures.
Heterogeneous states are also mixtures. Extremal states are defined to be states
which are not mixtures. Pure states are extremal. The null state is also
extremal. If all pure states have norm equal to one (i.e. $\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} = 1$) then there
are no more extremal states beyond the pure states and null state. However,
if some pure states have $\mathbf{r}_a^{-} \cdot \mathbf{p}_a^{\alpha} < 1$, then there may be additional extremal
states.

Usually treatments of convex sets of states do not make these distinctions.
More care is necessitated here because states are based on joint rather than
conditional probabilities.

Any state, extremal or mixed, can be written as the sum of homogeneous
states. This means that there must exist at least one set of $|\Omega_a|$ linearly
independent homogeneous states all of which can be pure. There cannot exist sets
with more linearly independent states than this.

5.3 Optimality of linear compression



We state the following theorem 

Theorem 1 If we allow arbitrary mixtures of preparations then 
(1) linear compression is optimal and (2) optimal compression is 
necessarily linear. 

The first point follows since there must exist $|\Omega_a|$ linearly independent states
(otherwise we could have implemented further linear compression). We can
take an arbitrary mixture $\sum_i \lambda_i \mathbf{p}_a^{\alpha_i}$ of these with $\sum_i \lambda_i \le 1$; then these $\lambda$'s are all independent
and hence we need $|\Omega_a|$ parameters and the compression is optimal. To prove
the second point consider representing the state by a list of $|\Omega_a|$ probabilities
$p^{\alpha\beta'}$ with $\beta' \in \Omega'_a$, where we do not demand that a general probability is given
by a linear function of these probabilities. Represent this list by the vector $\mathbf{q}_a^{\alpha}$.
Now

$$ p^{\alpha\beta'} = \mathbf{r}_a^{\beta'} \cdot \mathbf{p}_a^{\alpha} \quad \text{for all } \beta' \in \Omega'_a \tag{20} $$

Hence $\mathbf{q}_a^{\alpha} = C \mathbf{p}_a^{\alpha}$ where $C$ is a square matrix with real entries. $C$ must be
invertible since otherwise we could specify the state with fewer than $|\Omega_a|$ probabilities.
Hence

$$ p^{\alpha\beta} = \mathbf{r}_a^{\beta} \cdot C^{-1} \mathbf{q}_a^{\alpha} \tag{21} $$

for a general $\beta$. Hence the probability is linear in $\mathbf{q}_a^{\alpha}$ and so the compression is
linear.
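
The second part of this argument amounts to a change of fiducial set by an invertible matrix. A minimal numerical sketch, assuming a hypothetical classical bit and invented numbers for the state, the matrix $C$, and the effect:

```python
import numpy as np

# Hypothetical classical bit: the original fiducial effects are the two
# indicator effects, so p lists the joint probabilities of the two levels.
p = np.array([0.2, 0.5])

# Alternative list q built from two other linearly independent effects;
# each row of C is the effect vector of one of them.
C = np.array([[1.0, 1.0],    # trace effect
              [1.0, 0.0]])   # indicator of level 0
q = C @ p

r = np.array([0.3, 0.9])             # a general effect in the original representation
r_in_q = np.linalg.solve(C.T, r)     # r . C^{-1} q  ==  (C^{-T} r) . q
```

The same probability is obtained from either representation, and it is a linear function of `q`, so nothing is gained or lost by the change of fiducial set.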



6 Composition 

6.1 Preliminaries and notation 

As we discussed above, systems associated with more than one wire can be 
thought of as composite. The p, r, Z framework just discussed can be enriched 
by adapting it to deal separately with the components of composite systems 
(rather than regarding all the wires at each time step as constituting a single 
system). The advantage of this is that we can break up the calculation into 
smaller parts and thereby define a theory by associating matrices to smaller 
transformations. Ultimately, we would like to have a matrix associated with 
each operation in the set of allowed operations which can be used to calculate 
the probability for any circuit. A transformation is now associated with a matrix 
such as

$$ {}^{bacd}Z^{\alpha}_{acb} \tag{22} $$

This transformation inputs a system of type $acb$ and outputs a system of type
$bacd$. The label $\alpha$ denotes the circuit fragment used to do this including the
knob settings and the outcomes at each operation in the fragment. The ordering
of the symbols representing the system (such as $bacd$) is significant in that it
is preserved between transformations to indicate how the wires are connected.
Thus, the matrix for a transformation comprised of two successive
transformations is written

$$ {}^{d}Z^{\beta}_{bacd}\; {}^{bacd}Z^{\alpha}_{acb} \tag{23} $$

The system types in between the two transformations must match (as in this
example) since we employ the convention that the wires are in the same order
from the output of one transformation to the input of the next. We can allow
that the symbols for the systems (such as $c$) actually correspond to composite
systems (so that they correspond to a cluster of wires). For example, it might
be that $c = aabb$. Some transformations consist of disjoint circuit fragments
and it is useful to have notation for these. We write

$$ {}^{(b)(ac)(-)}Z^{(\alpha)(\beta)(\gamma)}_{(a)(c)(b)} \tag{24} $$

to indicate that the transformation consists of three disjoint parts, one
transforming from system type $a$ to $b$ and labeled by $\alpha$, one transforming from $c$ to
$ac$ and labeled by $\beta$, and one which inputs $b$ and has no open outputs (which we
denote by $-$ when necessary for disambiguation) and labeled by $\gamma$. If it is clear
from the context we will sometimes use the less cumbersome notation

$$ {}^{cd}Z^{\alpha\beta}_{ab} = {}^{(c)(d)}Z^{(\alpha)(\beta)}_{(a)(b)} \tag{25} $$

We may sometimes want to depart from the convention that the wires are in
the same order from one transformation to the next in which case we label the
wires (and wire clusters as appropriate) using integers 1, 2, ... as follows

$$ {}^{(b)_5 (ac)_{64} (d)_7}Z^{(\alpha)(\beta)(\gamma)}_{(a)_1 (c)_3 (b)_2} \tag{26} $$

In this example wire 6 is an output wire of type $a$. We can then rewrite (23) as

$$ {}^{(d)_8}Z^{\beta}_{(dcab)_{7654}}\; {}^{(bacd)_{4567}}Z^{\alpha}_{(acb)_{123}} \tag{27} $$

We see that the wires match (for example wire 6 is of type $c$ as an output from the
first transformation and an input into the second transformation). The integers
labeling the wires are of no significance and any expression is invariant under
any reassignment of these labels (this is a kind of discrete general covariance).



6.2 Commutation 

Consider the situation shown in Fig. 8. By inspection of this diagram we can
write

$$ {}^{cd}Z^{\alpha\beta}_{ab} = {}^{cd}Z^{0\beta}_{cb}\; {}^{cb}Z^{\alpha 0}_{ab} = {}^{cd}Z^{\alpha 0}_{ad}\; {}^{ad}Z^{0\beta}_{ab} \tag{28} $$

where the 0 denotes that we do nothing (the identity transformation). The
first equation here is obtained by first transforming from hyperplane $H_1$ to $H_2$
(past operation A) and then from $H_2$ to $H_4$ (past operation B). To get the
second equation we evolve from $H_1$ to $H_3$ first (past B) then from $H_3$ to $H_4$
(past A). There are a few points of interest here. First note that we can break
down a compound transformation into its parts. Second, we see that there
is a kind of commutation - it does not matter whether we update at A or B
first. This property is not assumed but derived from more basic assumptions
and definitions. In the special case where wires $c$ and $d$ are of type $a$ and $b$
respectively (so the transformations do not change the system type) we have
the commutation property

$$ \left[\, {}^{ab}Z^{\alpha 0}_{ab},\; {}^{ab}Z^{0\beta}_{ab} \,\right] = {}^{ab}Z^{\alpha 0}_{ab}\; {}^{ab}Z^{0\beta}_{ab} - {}^{ab}Z^{0\beta}_{ab}\; {}^{ab}Z^{\alpha 0}_{ab} = 0 \tag{29} $$

The usual commutation relation is, then, a special case of the more general
relation in (28) where the local transformations may change the system type.

Figure 8: We can consider the evolution of this circuit with respect to different
foliations.

The fact that we can break up a compound transformation into smaller
parts is potentially useful. But there is a stumbling block. The matrix ${}^{cb}Z^{\alpha 0}_{ab}$
transforms past the A operation. However, it is a $|\Omega_{cb}| \times |\Omega_{ab}|$ matrix. That is,
we still have to incorporate some baggage because we include wire $b$. It would
be good if we could write

$$ {}^{cb}Z^{\alpha 0}_{ab} = {}^{c}Z^{\alpha}_{a} \otimes {}^{b}Z^{0}_{b} \tag{30} $$

where ${}^{b}Z^{0}_{b}$ is just the identity matrix (as discussed above). By considering the
sizes of the matrices it is clear that (30) implies $|\Omega_{ab}| = |\Omega_a||\Omega_b|$. In Sec. 7.1.1
we show that equation (30) holds in general if $|\Omega_{ab}| = |\Omega_a||\Omega_b|$ is true (for any
system types $a$ and $b$). We will also see that this condition corresponds to a
very natural class of physical theories. If (30) holds true we can break any
circuit down into its basic operations appending the identity transformation as
necessary. Then we can calculate the probability associated with any circuit
from the transformation matrices associated with the operations.

Figure 9: A bipartite system 12 of type $ab$ is prepared by some preparation $\gamma$ and
subjected to effects $\alpha_i$ and $\beta_j$ on subsystem 1 and subsystem 2 respectively.
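
When the tensor-product form of Sec. 6.2 holds, the commutation of local updates at A and B follows from the mixed-product property of the Kronecker product. The sketch below illustrates only this piece of linear algebra: the sizes are hypothetical, and the random matrices stand in for local transformation matrices without being constrained to give valid probabilities.

```python
import numpy as np

rng = np.random.default_rng(1)
K_a, K_b, K_c, K_d = 2, 3, 4, 2        # hypothetical numbers of fiducial effects
Z_A = rng.uniform(size=(K_c, K_a))     # local transformation at A: type a -> type c
Z_B = rng.uniform(size=(K_d, K_b))     # local transformation at B: type b -> type d

# Update past A first (identity on wire b), then past B (identity on wire c).
A_first = np.kron(np.eye(K_c), Z_B) @ np.kron(Z_A, np.eye(K_b))
# Update past B first (identity on wire a), then past A (identity on wire d).
B_first = np.kron(Z_A, np.eye(K_d)) @ np.kron(np.eye(K_a), Z_B)
# Both orders equal the joint local transformation.
both = np.kron(Z_A, Z_B)
```

The two orders pass through intermediate systems of different types, so the intermediate matrices have different shapes, yet the overall matrix is the same.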

6.3 Homogeneous states and composite systems

Consider a composite system 12 consisting of systems of types $a$ and $b$ with
preparation C labeled $\gamma$. The state prepared by this is $\mathbf{p}_{ab}^{\gamma}$. If we block the
output 2 of the preparation then we have a preparation for a system of type $a$.
Let the state so prepared be $\mathbf{p}_a^{\gamma}$. Even if we do not block output 2 it follows from
part (ii) of the condition for closable sets of operations that this state gives
us the correct probabilities for all measurements on system 1 alone (that do not
involve system 2). We call $\mathbf{p}_a^{\gamma}$ the reduced state for system 1. It is, effectively,
the state of system 1 taken by itself.

We prove the following theorem.

Theorem 2 If one component of a bipartite system is in a homogeneous
state (i.e. the reduced state for this system is homogeneous)
all joint probabilities for separate effects measured on the two
systems factorize.

Systems 1 and 2 can be subjected to measurements A and B respectively (see
Fig. 9). We also consider the possibility of closing either or both outputs from
C. Let measurement A have outcomes $l_i^A$ for $i \in I$, the effects for which are labeled
$\alpha_i = (\mathcal{F}^A, \varphi^A, l_i^A)$. Similarly, for B we have outcomes $l_j^B$ for $j \in J$ and effects
labeled by $\beta_j = (\mathcal{F}^B, \varphi^B, l_j^B)$. By part (ii) of the condition for a closable set of
operations we have

$$ p^{\gamma --} = \sum_{i \in I} p^{\gamma \alpha_i -} = \sum_{j \in J} p^{\gamma - \beta_j} \tag{31} $$

and

$$ p^{\gamma - \beta_j} = \sum_{i \in I} p^{\gamma \alpha_i \beta_j} \tag{32} $$

If the output 2 from C is closed then we say that the state prepared (for the
system of type $a$) is $\mathbf{p}_a^{\gamma}$ (the reduced state). We say that the state prepared by
C and B with outcome $l_j^B$ (this is a preparation circuit fragment) is $\mathbf{p}_a^{\gamma\beta_j}$. It
follows from part (ii) of the condition for a closable set of operations that

$$ \mathbf{p}_a^{\gamma} = \sum_{j \in J} \mathbf{p}_a^{\gamma\beta_j} \tag{33} $$

If $\mathbf{p}_a^{\gamma}$ is homogeneous then all the $\mathbf{p}_a^{\gamma\beta_j}$ states must be parallel to it. Hence we
can write $\mathbf{p}_a^{\gamma\beta_j} = \eta_j \mathbf{p}_a^{\gamma}$. We must have

$$ p^{\gamma\alpha_i\beta_j} = \mathbf{r}_a^{\alpha_i} \cdot \mathbf{p}_a^{\gamma\beta_j} = \eta_j\, \mathbf{r}_a^{\alpha_i} \cdot \mathbf{p}_a^{\gamma} = \eta_j\, p^{\gamma\alpha_i -} \tag{34} $$

Summing this over $i$ and using (31) and (32) we obtain $p^{\gamma - \beta_j} = \eta_j\, p^{\gamma --}$. Hence

$$ p^{\gamma\alpha_i\beta_j}\, p^{\gamma --} = p^{\gamma\alpha_i -}\, p^{\gamma - \beta_j} \tag{35} $$

Here $p^{\gamma --}$ is the probability of the preparation being successful. Dividing this
through by $(p^{\gamma --})^2$ and using Bayes' rule we obtain

$$ \mathrm{prob}(l_i^A\, l_j^B \mid \mathrm{prep}) = \mathrm{prob}(l_i^A \mid \mathrm{prep})\, \mathrm{prob}(l_j^B \mid \mathrm{prep}) \tag{36} $$

Hence we see that if one system is in a homogeneous state then joint
probabilities factorize between the two ends (obviously the result also holds if both
components are in homogeneous states). An obvious corollary is

Corollary 1 If the state of a bipartite system 12 of type $ab$ with
preparation $\gamma$ is of norm one and the reduced state of either or both
components is pure then

$$ p^{\gamma\alpha\beta} = p^{\gamma\alpha -}\, p^{\gamma - \beta} \tag{37} $$

where $\alpha$ is any effect on 1 alone and $\beta$ is any effect on 2 alone.

This is true since pure states are necessarily homogeneous and because $p^{\gamma --} = 1$
since the state is of norm 1.
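
The factorization in (35) can be checked on a simple example. The sketch below assumes classical probability theory, where a state is a vector of nonnegative joint probabilities over levels and a homogeneous state has a single nonzero entry; the joint distribution and the two effects are invented numbers.

```python
import numpy as np

# Hypothetical classical bipartite state: joint[i, j] is the joint probability that
# the preparation succeeds, system 1 is in level i, and system 2 is in level j.
joint = np.array([[0.0, 0.0, 0.0],
                  [0.1, 0.3, 0.2]])
reduced_1 = joint.sum(axis=1)            # reduced state of system 1 (one nonzero entry)

alpha = np.array([0.4, 0.9])             # an effect on system 1 (response per level)
beta = np.array([0.2, 0.5, 1.0])         # an effect on system 2

p_joint = float(alpha @ joint @ beta)    # both effects applied
p_alpha = float(alpha @ reduced_1)       # effect on 1, output 2 closed
p_beta = float(joint.sum(axis=0) @ beta) # effect on 2, output 1 closed
p_prep = float(joint.sum())              # both outputs closed

# For contrast: a correlated state whose reduced state on system 1 is heterogeneous.
correlated = np.array([[0.3, 0.0, 0.0],
                       [0.0, 0.0, 0.3]])
c_joint = float(alpha @ correlated @ beta)
c_alpha = float(alpha @ correlated.sum(axis=1))
c_beta = float(correlated.sum(axis=0) @ beta)
c_prep = float(correlated.sum())
```

In the first case the product relation holds exactly; in the contrasting case it fails, as expected when neither reduced state is homogeneous.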

Now we consider a related but slightly different situation. Imagine we have a
preparation consisting of two disjoint parts one of which prepares a homogeneous
state.

Theorem 3 If a preparation, $\gamma\delta$, consists of two disjoint circuit
fragments $\gamma$ and $\delta$ which prepare systems of type $a$ and $b$
respectively, and one of these circuit fragments, $\gamma$, taken by itself prepares
a homogeneous state, $\mathbf{p}_a^{\gamma}$, then the state prepared by closing the
outputs of the second circuit fragment of $\gamma\delta$ is parallel to $\mathbf{p}_a^{\gamma}$ (and
therefore also homogeneous).

The proof of this theorem is based on the same idea as the previous theorem.
We can put $\delta = (\mathcal{F}, \varphi, l)$ where $\mathcal{F}$ denotes the actual circuit fragment, $\varphi$ the
settings, and $l$ the outcomes. Then we can put $\bar{\delta} = (\mathcal{F}, \varphi, \bar{l})$. This is the circuit
fragment associated with not seeing $l$. Either $l$ or $\bar{l}$ must happen, and hence

$$ \mathbf{p}_a^{\gamma} = \mathbf{p}_a^{\gamma\delta} + \mathbf{p}_a^{\gamma\bar{\delta}} \tag{38} $$

where $\mathbf{p}_a^{\gamma\delta}$ and $\mathbf{p}_a^{\gamma\bar{\delta}}$ are the states prepared for system 1 when the outputs of $\delta$
and $\bar{\delta}$ respectively are closed. Since $\mathbf{p}_a^{\gamma}$ is homogeneous both $\mathbf{p}_a^{\gamma\delta}$ and $\mathbf{p}_a^{\gamma\bar{\delta}}$ must be
parallel to it and hence are also homogeneous.

6.4 Fiducial measurements for composite systems

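Assume that system 1 is prepared by some preparation $\gamma$ in a homogeneous state
$\mathbf{p}_a^{\gamma}$ and, similarly, system 2 is prepared by $\delta$ in homogeneous state $\mathbf{p}_b^{\delta}$. Assume these
are two separate preparations corresponding to two disjoint circuit fragments.
Hence we can consider the joint preparation giving rise to the state $\mathbf{p}_{ab}^{\gamma\delta}$.

A natural question is, what is the relationship between the states $\mathbf{p}_a^{\gamma}$, $\mathbf{p}_b^{\delta}$,
and $\mathbf{p}_{ab}^{\gamma\delta}$? By virtue of Theorem 3 we know that the reduced state at either end
is homogeneous and hence, by virtue of Theorem 2 (equation (35) in particular)
and the fact that the subsystems are in homogeneous states we can write

$$ \mathbf{r}_{ab}^{\alpha\beta} \cdot \mathbf{p}_{ab}^{\gamma\delta} = p^{\gamma\delta\,\alpha\beta} = \mu_{\gamma\delta}\, p^{\gamma\alpha}\, p^{\delta\beta} = \mu_{\gamma\delta}\, (\mathbf{r}_a^{\alpha} \otimes \mathbf{r}_b^{\beta}) \cdot (\mathbf{p}_a^{\gamma} \otimes \mathbf{p}_b^{\delta}) \tag{39} $$

where we obtain

$$ \mu_{\gamma\delta} = \frac{p^{\gamma\delta --}}{p^{\gamma -}\, p^{\delta -}} \tag{40} $$

by putting $\alpha = -$ and $\beta = -$.

We know that there must exist $K_a = |\Omega_a|$ linearly independent homogeneous
states for system 1 (they can all be chosen to be pure). Let $\gamma \in \Pi_a$ be the
preparations associated with one such set of linearly independent homogeneous
states for system 1 (where $|\Pi_a| = |\Omega_a|$). Likewise we have $K_b = |\Omega_b|$ linearly
independent homogeneous states for system 2 with preparations $\delta \in \Pi_b$ (where
$|\Pi_b| = |\Omega_b|$). The $K_a K_b$ vectors

$$ \mathbf{p}_a^{\gamma} \otimes \mathbf{p}_b^{\delta} \quad \text{for } \gamma\delta \in \Pi_a \times \Pi_b \tag{41} $$

are linearly independent and, similarly, the $K_a K_b$ vectors

$$ \mathbf{r}_a^{\alpha} \otimes \mathbf{r}_b^{\beta} \quad \text{for } \alpha\beta \in \Omega_a \times \Omega_b \tag{42} $$

are linearly independent. It follows from (39) and some simple linear algebra
that the $K_a K_b$ vectors

$$ \mathbf{p}_{ab}^{\gamma\delta} \quad \text{for } \gamma\delta \in \Pi_a \times \Pi_b \tag{43} $$

are linearly independent as are the $K_a K_b$ vectors

$$ \mathbf{r}_{ab}^{\alpha\beta} \quad \text{for } \alpha\beta \in \Omega_a \times \Omega_b \tag{44} $$

From this we can prove the following

Theorem 4 For composite systems we can choose $\Omega_{ab}$ such that

$$ \Omega_a \times \Omega_b \subseteq \Omega_{ab} \tag{45} $$

This theorem has an immediate corollary:

Corollary 2 For composite systems $K_{ab} \ge K_a K_b$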






Here Ka — \ fla\- This inequality follows from fact that we have at least KaKi, 
linearly independent states in (|43)) . The set relation ([45]) follows since the effects 
in are linearly independent and hence that we can choose KaKb of the Kab 
fiducial effects in riab to correspond to local effects. By a local effect we mean 
one comprised of disjoint circuit fragments, one on system a and one on system 
b. It follows from that we can write 

i^ab = i^ab U flab whcrC flab = X ^b (46) 

The fiducial effects in Clab are local. Hence, we can write a general bipartite 
state, with preparation e, as 

Pab - Pab ® Pab (47) 

where the elements of p^j, are the probabilities corresponding to effects in Clab 
and the elements of p^j, are the probabilities corresponding to the effects in 
Clab- Note that pj^ lives in the tensor product space of the vector spaces for 
component systems because Clab = Cla x Clh. If systems 1 and 2 are both in 
homogeneous states then it follows from equation that 

Plt^'^fSPl^pt (48) 

A similar result holds even if only one system is in a homogeneous state (this 
follows from theorem 1). These results for bipartite systems generalize in the 
obvious way to composites having more than two component systems (for three 
systems we use Clabc = {Cla x Clb x Qc) U Clabc)- 

It is easy to see that if each system is subject to its own local transformation 
(so the circuit fragments corresponding to the transformations are disjoint) then 
the state updates as 

Plf = [(X ® XWab] © ''Kb^Plb (49) 

where the form of the '^'^Z^j^ matrix depends, in general, on the particular theory 
(it acts on p^^ to give p^d'')- This equation follows by considering the case 
where both systems are in homogeneous states. Then we have KaKb linearly 
independent vectors p^^f — fi-yspj pf with {-fS G Cla x Clb)- The p part of the 
state must remain as a tensor product like this after the local transformations 
to ensure consistency with equation p9p (notice that if it did not then there 
would exist a correlation- revealing measurement contradicting Theorem 2). But 
the vectors p2 (X" pf span the space of possible Pa vectors and so ([^^ must be 
true generally. 

If system d is the null system (so the transformation on system 2 is an effect) 
then the $Z^{\not\times}$ matrix has no elements and we can write 

$$\mathbf{p}_{c} = ({}^{c}Z_{a} \otimes Z_{b})\,\mathbf{p}^{\times}_{ab} \tag{50}$$

(Note that if the $Z^{\not\times}$ matrix did have elements then the two sides of this equation 
would be column vectors of different lengths.) In particular, the reduced state 
of system 1 is given by 

$$\mathbf{p}_{a} = \left[{}^{a}Z_{a} \otimes (\mathbf{r}^{-}_{b})^{T}\right]\mathbf{p}^{\times}_{ab} \tag{51}$$

(recall that ${}^{a}Z_{a}$ is the identity). Hence the reduced state at either end depends 
only on the elements in the $\mathbf{p}^{\times}_{ab}$ part of $\mathbf{p}_{ab}$. 

If both systems c and d are null then (50) becomes 

$$p^{\alpha\beta} = (\mathbf{r}^{\alpha}_{a} \otimes \mathbf{r}^{\beta}_{b}) \cdot \mathbf{p}^{\times}_{ab} \tag{52}$$

This also follows directly from (39) and the fact that the vectors $\mathbf{p}^{\gamma}_{a} \otimes \mathbf{p}^{\delta}_{b}$ span 
the space of possible $\mathbf{p}^{\times}_{ab}$ vectors. This equation tells us that all local effects 
are linear combinations of the fiducial effects corresponding to the $\Omega^{\times}_{ab}$ part of 
$\Omega_{ab}$. Hence, all effects $\mathbf{r}^{\gamma}_{ab}$ with $\gamma \in \Omega^{\not\times}_{ab}$ are nonlocal: the corresponding circuit 
fragments cannot consist of disjoint parts acting separately on a and b. Theories 
in which the state can be entirely determined by local measurements are called 
locally tomographic. This gives us an important theorem: 

Theorem 5 Theories having $K_{ab} = K_aK_b$ are locally tomographic 
and vice-versa. 

This corresponds to the case where $\Omega^{\not\times}_{ab}$ is the null set. 
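
Theorem 5 can be illustrated with two qubits in quantum theory, where $K_{ab} = 16 = K_aK_b$. The following is a minimal sketch, not part of the original development: the choice of four single-qubit projectors as fiducial effects, the helper names (`projector`, `local_statistics`, `reconstruct`) and the Bell-state example are all assumptions made for illustration. It shows that the probabilities for the 16 product (local) effects suffice to recover an entangled state.

```python
import numpy as np

def projector(vec):
    """Rank-one projector onto the (normalized) vector vec."""
    vec = np.asarray(vec, dtype=complex)
    vec = vec / np.linalg.norm(vec)
    return np.outer(vec, vec.conj())

# Assumed single-qubit fiducial effects: projectors onto |0>, |1>, |+>, |+i>.
LOCAL = [projector(v) for v in ([1, 0], [0, 1], [1, 1], [1, 1j])]
# The 16 product effects, each acting separately on the two qubits.
PRODUCT = [np.kron(pa, pb) for pa in LOCAL for pb in LOCAL]

def local_statistics(rho):
    """Probabilities of the 16 product effects for the two-qubit state rho."""
    return np.array([np.trace(e @ rho).real for e in PRODUCT])

def reconstruct(probs):
    """Recover the state from local statistics alone by inverting the Gram matrix."""
    gram = np.array([[np.trace(e1 @ e2).real for e2 in PRODUCT] for e1 in PRODUCT])
    coeffs = np.linalg.solve(gram, probs)
    return sum(c * e for c, e in zip(coeffs, PRODUCT))

bell = projector([1, 0, 0, 1])            # maximally entangled state
recovered = reconstruct(local_statistics(bell))
print(np.allclose(recovered, bell))       # True
```

The reconstruction works because the product effects already span the space of two-qubit Hermitian operators, which is the statement that the non-product part of the fiducial set is empty.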

6.5 Homogeneity and uncorrelatability are equivalent notions 

We define: 

An uncorrelatable state is one having the property that a system 
in this state cannot be correlated with any other system (so that any 
joint probabilities factorize). 

Let $\mathbf{p}_a$ be an uncorrelatable state. Let $\mathbf{p}_{ab}$ be a state for a bipartite system 
having the property that its reduced state is $\mathbf{p}_a$. If the bipartite system is prepared 
in any such state then the joint probabilities will, by definition, factorize. 
We will prove 

Theorem 6 If we allow arbitrary mixtures then all homogeneous 
states are uncorrelatable and all uncorrelatable states are homoge- 
neous. 

That is, homogeneity and uncorrelatability are equivalent notions. It follows 
immediately from Theorem 2 that homogeneous states are uncorrelatable. To 
prove that uncorrelatable states are homogeneous we assume the contrary. Thus 
assume that the heterogeneous state $\mathbf{p}_a = \mathbf{p}^{\gamma_1}_a + \mathbf{p}^{\gamma_2}_a$ (where the two terms are 
non-parallel) is uncorrelatable. Since the state is assumed to be uncorrelatable 
we must be able to write the state of the composite system as 

$$\mathbf{p}_{ab} = \mu\,[\mathbf{p}_a \otimes \mathbf{p}_b] \oplus \mathbf{p}^{\not\times}_{ab} \tag{53}$$

since otherwise the probability does not factorize for all $\alpha\beta \in \Omega_a \times \Omega_b$. Hence 

$$\mathbf{p}_{ab} = \mu\,[\mathbf{p}^{\gamma_1}_a \otimes \mathbf{p}_b] \oplus \mathbf{p}^{\gamma_1 \not\times}_{ab} + \mu\,[\mathbf{p}^{\gamma_2}_a \otimes \mathbf{p}_b] \oplus \mathbf{p}^{\gamma_2 \not\times}_{ab} \tag{54}$$

But, using (51), we see that the reduced state, $\mathbf{p}_a$, for system 1 of (54) is parallel 
to the reduced state, $\mathbf{p}'_a$, of 

$$\mathbf{p}'_{ab} = \mu_{\gamma_1\delta_1}[\mathbf{p}^{\gamma_1}_a \otimes \mathbf{p}^{\delta_1}_b] \oplus \mathbf{p}^{\gamma_1\delta_1 \not\times}_{ab} + \mu_{\gamma_2\delta_2}[\mathbf{p}^{\gamma_2}_a \otimes \mathbf{p}^{\delta_2}_b] \oplus \mathbf{p}^{\gamma_2\delta_2 \not\times}_{ab} \tag{55}$$

where we choose any two distinct $\mathbf{p}^{\delta_1}_b$ and $\mathbf{p}^{\delta_2}_b$ having normalization such that 
$\mu_{\gamma_1\delta_1} = \mu_{\gamma_2\delta_2}$. It is possible to choose two distinct states like this if there 
exist systems that are non-trivial in the sense that they require more than one 
fiducial effect. (If all systems are trivial then all states are homogeneous and 
so, by Theorem 2, all states are uncorrelatable in any case.) We allow arbitrary 
mixtures and so can take mixtures with the null state to make sure $\mathbf{p}^{\delta_1}_b$ and $\mathbf{p}^{\delta_2}_b$ 
have normalization so that $\mu_{\gamma_1\delta_1} = \mu_{\gamma_2\delta_2}$. The state (55) is preparable by taking 
a mixture of the preparations $\gamma_1\delta_1$ and $\gamma_2\delta_2$. This state is clearly correlated. By 
taking a mixture with the null state for the longer of $\mathbf{p}_a$ and $\mathbf{p}'_a$ we obtain two 
equal states, one of which is uncorrelatable by assumption and one of which is 
correlatable by the above proof. Hence our assumption was false and it follows 
that uncorrelatable states are homogeneous. 
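
The key step of the proof, that a mixture of two distinct product preparations is correlated, can be checked numerically in the special case of classical probability theory. This is only a sketch under assumptions: the two-outcome systems, the particular probability values and the helper `is_product` are hypothetical and chosen for illustration. The test used is the factorization condition for possibly subnormalized states, namely that the joint table times its total weight equals the outer product of its marginals.

```python
import numpy as np

def is_product(joint, tol=1e-12):
    """True if joint * (total weight) equals the outer product of the two marginals."""
    joint = np.asarray(joint, dtype=float)
    total = joint.sum()
    marginal_a = joint.sum(axis=1)
    marginal_b = joint.sum(axis=0)
    return bool(np.allclose(joint * total, np.outer(marginal_a, marginal_b), atol=tol))

# Hypothetical classical states: two pure (homogeneous) states for system a,
# and two distinct states for system b.
pa1, pa2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
pb1, pb2 = np.array([0.8, 0.2]), np.array([0.3, 0.7])

# A single product preparation: the pure state pa1 cannot be correlated.
product_state = np.outer(pa1, pb1)

# Equal mixture of two product preparations: the reduced state of a is the
# heterogeneous state (0.5, 0.5), and the joint state is correlated.
mixture = 0.5 * np.outer(pa1, pb1) + 0.5 * np.outer(pa2, pb2)

print(is_product(product_state))  # True
print(is_product(mixture))        # False
```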

6.6 Probabilities for disjoint circuits 

If we have two disjoint setting-outcome specified circuits, $\alpha$ and $\beta$, then we expect 
the joint probability to factorize 

$$p^{\alpha\beta} = p^{\alpha}p^{\beta} \tag{56}$$

A simple application of Bayes' rule shows that (56) is equivalent to demanding 
that the probability associated with a circuit is independent of the outcomes 
seen at other disjoint circuits. This is an extremely natural condition since 
otherwise we would have to take into account all the outcomes seen on all other 
disjoint circuits in the past which form a part of our memory before writing 
down a probability for the circuit. On the other hand, one can easily envisage a 
situation in which the probability is not independent of outcomes elsewhere. For 
example, the eventual outcome of a spinning coin might be correlated with the 
outcome of an apparently disjoint experiment which is, incidentally, influenced 
by photons scattered from the coin while it spins. More generally, if there are 
hidden variables, then there may be correlations between outcomes even though 
the marginals are independent of what happens at the other side. It is not clear 
that disjointness of the circuits is enough to prevent such correlations. In view 
of this, it is interesting that the following theorem holds. 

Theorem 7 If there exists at least one type of system which can be 
prepared in a pure state of norm one then the probability associated 
with any circuit is independent of the outcomes seen at any other disjoint 
circuit. 

Let the preparation associated with this pure state of norm one be $\gamma$. Consider 
the circuit $(\gamma-)(\gamma-)(\alpha)(\beta)$ consisting of two instances of the circuit obtained 
by performing the trace effect on a preparation $\gamma$, and two more disjoint circuits 
$\alpha$ and $\beta$. Thus we have four disjoint circuits in total. We can regard this circuit 
as consisting of the effect $-\alpha$ on one of the $\gamma$ preparations and the effect $-\beta$ on 
the other $\gamma$ preparation. Then we have 

$$p^{(\gamma-\alpha)(\gamma-\beta)}\,p^{(\gamma-)(\gamma-)} = p^{\gamma-\alpha}\,p^{\gamma-\beta} \tag{57}$$

by Theorem 2 (equation (55) in particular). But $p^{\gamma-} = 1$ since the state is of 
norm one. Hence, using Bayes' rule, (56) follows and the theorem is proved. 

We say that a set of operations is fully closable if it is closable and if the 
probability for a circuit is independent of the outcomes seen at other disjoint 
circuits. It follows from Theorem 7 that closable sets of operations admitting 
at least one pure state of norm one are fully closable. In the case that we have 
a fully closable set of operations it is clear that we can write the $\mathbf{p}^{\times}_{ab}$ part of the 
state associated with disjoint preparations as 

$$\mathbf{p}^{\gamma\delta\times}_{ab} = \mathbf{p}^{\gamma}_{a} \otimes \mathbf{p}^{\delta}_{b} \tag{58}$$

It is interesting to note that (48) is an example of this with $\mu_{\gamma\delta} = 1$, which 
clearly follows from (40) when probabilities for disjoint circuits factorize. 



6.7 Examples of the relationship between $K_{ab}$, $K_a$, and $K_b$ 

If $N_a$ is the number of states that can be distinguished in a single shot 
measurement then it is reasonable to suppose $N_{ab} = N_aN_b$. This is true in all the 
examples we will discuss. In classical probability theory $K_a = N_a$. In quantum 
theory $K_a = N_a^2$. Hence $K_{ab} = K_aK_b$ and so, by Theorem 5, we have local 
tomography in these theories. In real Hilbert space quantum theory, where 
the state is represented by a positive density matrix with real entries, we have 
$K_a = N_a + N_a(N_a - 1)/2!$. This has $K_{ab} > K_aK_b$ which is consistent with 
Corollary 2. However, quaternionic quantum theory has $K_a = N_a + 4N_a(N_a - 1)/2!$ 
which has $K_{ab} < K_aK_b$. This is inconsistent with Corollary 2 and hence 
quaternionic quantum theory cannot be formulated in this framework. Since we have 
made very minimal assumptions (only that we have closable sets of operations in 
an operational framework) it seems that quaternionic quantum theory is simply 
an inconsistent theory (at least for the finite $K_a$ case considered here). 
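
The counting in these four examples is easy to check directly. The sketch below assumes $N_{ab} = N_aN_b$ as in the text; the function names and the sample values $N_a = 2$, $N_b = 3$ are hypothetical and chosen only for illustration.

```python
def k_classical(n):
    return n                              # K = N

def k_quantum(n):
    return n * n                          # K = N^2

def k_real(n):
    return n + n * (n - 1) // 2           # K = N + N(N-1)/2

def k_quaternionic(n):
    return n + 4 * n * (n - 1) // 2       # K = N + 4N(N-1)/2

def compare(k, na, nb):
    """Relation between K_ab and K_a K_b, assuming N_ab = N_a N_b."""
    k_ab = k(na * nb)
    k_a_k_b = k(na) * k(nb)
    if k_ab == k_a_k_b:
        return "="
    return ">" if k_ab > k_a_k_b else "<"

for name, k in [("classical", k_classical), ("quantum", k_quantum),
                ("real", k_real), ("quaternionic", k_quaternionic)]:
    print(name, "K_ab", compare(k, 2, 3), "K_a K_b")
# classical K_ab = K_a K_b
# quantum K_ab = K_a K_b
# real K_ab > K_a K_b
# quaternionic K_ab < K_a K_b
```

For $N_a = 2$, $N_b = 3$ the real case gives $K_{ab} = 21$ against $K_aK_b = 18$, while the quaternionic case gives $K_{ab} = 66$ against $K_aK_b = 90$.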



7 Theories for which $K_{ab} = K_aK_b$ 
7.1 Motivation for local tomography 

Of the examples we just considered, the two corresponding to real physics are 
both locally tomographic, having $K_{ab} = K_aK_b$. This is a very natural property 
for a theory to have (it is one of the axioms in [6]). It says that, from a counting 
point of view, no new properties come into existence when we put two systems 
together. It allows a certain very natural type of locality so that it is possible 
to characterize a system made from many parts by looking at the components. 
It implies that the full set of states for a composite system requires the same 
number of parameters for its specification as the separable states (formed by 
taking mixtures of states prepared by disjoint circuits). Given that this is such 
a natural constraint we will study it a little more closely. We will also give 
axioms for classical probability and quantum theory since they are examples of 
this sort. 

7.1.1 Operation locality 

An extremely useful property of locally tomographic theories is that they are 
local in the sense that the state is updated by the action of local matrices at 
each operation. We will call this property operation locality. We see from (47) 
that if $\Omega^{\not\times}_{ab} = \emptyset$ then $\mathbf{p}^{\times}_{ab} = \mathbf{p}_{ab}$ and $\mathbf{p}^{\times}_{cd} = \mathbf{p}_{cd}$ and hence according to (49) we see 
that under local transformations at each end (corresponding to disjoint circuit 
fragments) the state will update as 

$$\mathbf{p}_{cd} = ({}^{c}Z_{a} \otimes {}^{d}Z_{b})\,\mathbf{p}_{ab} \tag{59}$$

Hence 

$${}^{cd}Z_{ab} = {}^{c}Z_{a} \otimes {}^{d}Z_{b} \tag{60}$$

for transformations corresponding to disjoint circuit fragments. In particular 
this implies 

$${}^{cb}Z_{ab} = {}^{c}Z_{a} \otimes {}^{b}Z_{b} \tag{61}$$

This is equation (30) we speculated about in Section 6.2. This means an 
operation has a trivial effect on systems that do not pass through it. If we have 
a fully closable set of transformations (as long as there exists at least one state 
of norm one this follows from Theorem 7) then we can specialize this equation 
to the case of null input states (where a and/or b is the null system) since the state 
prepared by disjoint preparations is a product state. We will assume this in 
what follows. 

The great thing about (60) is that it can be used to calculate the probability 
for any circuit using a Z matrix for each operation. To do this we choose a 
complete foliation and then use the tensor product to combine operations at 
each time step. One way of calculating the probability for the example 
shown in Fig. 10 is 

^^(■^^/'^ %"<i)(^'^/® "Z^^ ''Z^)i/zi(g,''Z°® 'Z°^(g, ''Z0)("''Z"® ^-^Z^) (62) 

While it is very satisfying that the calculation can be broken down like this, 
it is unfortunate that we have to pad out the calculation with lots of identity 
matrices like ${}^{a}Z_{a}$. This means that there are more matrices than operations in 
this calculation. Relatedly, we have to be very careful about the order in which we 
take the product of all these matrices (it has to correspond to some complete foliation). 

Figure 10: We show how to calculate the probability associated with this simple 
example in the text using a Z matrix for each operation. 

In the causaloid approach [4, 7, 5] we will have neither of these problems. We 
simply take what is called the causaloid product of a vector associated with 
each operation without regard for the order and without having to pad out the 
calculation with identity matrices. 
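
The foliation-based calculation can be sketched for a small classical circuit. Everything specific here is an assumption for illustration and is not the circuit of Fig. 10: two bit-valued systems on separate wires, a noisy flip acting on the first wire only, and the helper `circuit_probability`. Each layer of the foliation is a tensor (Kronecker) product of the matrices for the operations in that layer, padded with an identity matrix on any wire that no operation touches, and the layers are multiplied in foliation order.

```python
import numpy as np

def circuit_probability(layers):
    """Multiply the layer matrices in foliation order (earliest layer first)."""
    result = layers[0]
    for layer in layers[1:]:
        result = layer @ result
    return float(result[0, 0])

# Hypothetical classical example with K = 2 on each wire.
prep_a = np.array([[0.6], [0.4]])        # preparation on wire 1 (column vector)
prep_b = np.array([[0.5], [0.5]])        # preparation on wire 2
flip = np.array([[0.1, 0.9],
                 [0.9, 0.1]])            # stochastic transformation on wire 1 only
identity = np.eye(2)                     # padding for wire 2
effect_a = np.array([[1.0, 0.0]])        # effect: outcome 0 on wire 1 (row vector)
effect_b = np.array([[1.0, 1.0]])        # trace effect on wire 2

layers = [
    np.kron(prep_a, prep_b),             # first layer: both preparations
    np.kron(flip, identity),             # second layer: transformation padded with identity
    np.kron(effect_a, effect_b),         # final layer: both effects
]
prob = circuit_probability(layers)

# Because the two wires never interact, the result factorizes.
prob_a = (effect_a @ flip @ prep_a).item()
prob_b = (effect_b @ prep_b).item()
print(np.isclose(prob, prob_a * prob_b))  # True
```

Reordering the layers would give a meaningless product (or a shape error), which is the sense in which the order has to correspond to a complete foliation.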

7.2 Classical probability theory 

It is very easy to characterize classical probability theory in this framework. It 
is fully characterized by the following two axioms: 

Composition $K_{ab} = K_aK_b$ 

Transformations Transformation matrices, ${}^{c}Z_{a}$, have the property that the 
entries are nonnegative and the sum of the entries in each column is less 
than or equal to 1. 

To see this is equivalent to usual presentations of classical probability theory 
note the following. We can interpret $K_a = N_a$ as the maximum number of 
distinguishable states for this classical system (for a coin we have $N_a = 2$, for 
a die $N_b = 6$). The state is given by $\mathbf{p}_a = {}^{a}Z$ and is a column vector. The 
sum of the entries in this vector must be less than or equal to 1. They can 
be interpreted as the probabilities associated with each of the distinguishable 
outcomes (for a die they are the probabilities associated with each face). The 
trace effect is given by $(\mathbf{r}^{-}_{a})^{T} = Z_{a}$. It is a row vector. The value of each entry is 1 
and this is consistent with the constraint that the sum of the entries in each column cannot be 
greater than 1. Norm preserving transformations are stochastic matrices. Since 
$K_{ab} = K_aK_b$, we have the operation locality property and so we can calculate 
the probability for an arbitrary circuit from matrices for the operations that 
comprise it. 
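
The Transformations axiom is simple to test numerically. The sketch below is illustrative only: the function names, the tolerance and the example matrices are assumptions, not part of the axioms themselves. It checks the two stated conditions (nonnegative entries, column sums at most 1) and the stronger norm-preserving case in which every column sums to exactly 1.

```python
import numpy as np

def is_classical_transformation(z, tol=1e-12):
    """Entries nonnegative and each column sum at most 1."""
    z = np.atleast_2d(np.asarray(z, dtype=float))
    return bool(np.all(z >= -tol) and np.all(z.sum(axis=0) <= 1 + tol))

def is_norm_preserving(z, tol=1e-12):
    """A valid transformation whose columns each sum to exactly 1 (a stochastic matrix)."""
    z = np.atleast_2d(np.asarray(z, dtype=float))
    return is_classical_transformation(z, tol) and bool(
        np.allclose(z.sum(axis=0), 1.0, atol=tol))

die_state = np.full((6, 1), 1 / 6)       # fair die: a 6 x 1 column vector (null input)
trace_effect = np.ones((1, 6))           # row vector of ones (null output)
lossy = np.array([[0.5, 0.2],
                  [0.3, 0.8]])           # first column sums to 0.8: norm decreasing
invalid = np.array([[0.7, 0.2],
                    [0.5, 0.8]])         # first column sums to 1.2: not allowed

print(is_classical_transformation(die_state))     # True
print(is_classical_transformation(trace_effect))  # True
print(is_norm_preserving(lossy))                  # False
print(is_classical_transformation(invalid))       # False
```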



7.3 Quantum theory 

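To give the rules for Quantum Theory we need a few definitions first. Let $\mathcal{H}_{N_a}$ 
be a complex Hilbert space of dimension $N_a$. Let $\mathcal{V}_{N_a}$ be the space of Hermitian 
operators that act on this. All positive operators are Hermitian. Furthermore, 
it is possible to find a set of $N_a^2$ linearly independent positive operators that 
span $\mathcal{V}_{N_a}$. Let $\hat{P}^{\alpha}_{a}$ for $\alpha \in \Omega_a$ be one such set. Define 

$$\hat{\mathbf{P}}_a = \begin{pmatrix} \vdots \\ \hat{P}^{\alpha}_{a} \\ \vdots \end{pmatrix}, \qquad \alpha \in \Omega_a \tag{63}$$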



A positive map from $\mathcal{V}_{N_a}$ to $\mathcal{V}_{N_c}$ is one which acts on a positive operator 
$\hat{\rho}_a \in \mathcal{V}_{N_a}$ and returns a positive operator $\hat{\rho}_c \in \mathcal{V}_{N_c}$ for any positive operator $\hat{\rho}_a$. 
The map ${}^{c}\$_{a}$ is completely positive if ${}^{c}\$_{a} \otimes {}^{b}I_{b}$ is a positive map from $\mathcal{V}_{N_aN_b}$ 
to $\mathcal{V}_{N_cN_b}$ for any b, where ${}^{b}I_{b}$ is the identity map on $\mathcal{V}_{N_b}$. Further, we want our 
maps to have the property that they do not lead to probabilities greater than 1. 
We will demand that they must be completely trace non-increasing when they 
act on density matrices. This means that ${}^{c}\$_{a} \otimes {}^{b}I_{b}$ must be trace non-increasing 
for any b. Quantum theory is fully characterized by the following two axioms. 

Composition $K_{ab} = K_aK_b$ 

Transformations Transformation matrices are of the form 

$${}^{c}Z_{a} = \mathrm{Trace}\big(\hat{\mathbf{P}}_c\, {}^{c}\$_{a}(\hat{\mathbf{P}}_a^{T})\big)\, \big[\mathrm{Trace}(\hat{\mathbf{P}}_a \hat{\mathbf{P}}_a^{T})\big]^{-1} \tag{64}$$

where ${}^{c}\$_{a}$ is completely positive and completely trace non-increasing and 
T denotes transpose. 

This is a much more compact statement of the rules of quantum theory than is 
usually given. We will make a few remarks to decompress this. First note that 
$\mathrm{Trace}(\hat{\mathbf{P}}_c\, {}^{c}\$_{a}(\hat{\mathbf{P}}_a^{T}))$ is a $K_c \times K_a$ matrix having $\beta\gamma$ element $\mathrm{Trace}(\hat{P}^{\beta}_{c}\, {}^{c}\$_{a}(\hat{P}^{\gamma}_{a}))$ 
with $\beta \in \Omega_c$ and $\gamma \in \Omega_a$. By defining 

$$\mathbf{p}_a = \mathrm{Trace}(\hat{\mathbf{P}}_a \hat{\rho}_a) \tag{65}$$

where $\hat{\rho}_a$ is the usual quantum state, and using $\hat{\rho}_a = \hat{\mathbf{P}}_a \cdot \mathbf{s}_a$ (since $\hat{\rho}_a$ must be 
given by some sum of the linearly independent spanning set), it can be shown 
after a few lines of algebra that 

$$\mathbf{p}_c = {}^{c}Z_{a}\,\mathbf{p}_a \iff \hat{\rho}_c = {}^{c}\$_{a}(\hat{\rho}_a) \tag{66}$$

Hence we get the correct transformations with (64). Note that, as in the 
classical case, we can write a state as $\mathbf{p}_a = {}^{a}Z$. This state is associated with a 
completely positive map ${}^{a}\$$ where the absence of an input label implies that 
we have a null input system which corresponds to a one dimensional Hilbert 
space. This must have trace less than or equal to one since otherwise ${}^{a}\$ \otimes {}^{b}I_{b}$ 
would be trace increasing (this is the reason we impose that the map should 
be completely trace non-increasing rather than just trace non-increasing). Also 
note that for ${}^{a}\$_{a} = {}^{a}I_{a}$ (the identity map) we get the identity for ${}^{a}Z_{a}$ in (64) as we 
must. The composition rule $K_{ab} = K_aK_b$ implies that $N_{ab} = N_aN_b$ (since we 
see from inspection of the rank of the matrices that $|\Omega_a| = N_a^2$) and hence the 
tensor product structure for Hilbert spaces corresponds to the tensor product 
structure discussed in Sec. 7.1.1. The fact that we have $K_{ab} = K_aK_b$ means 
that we have the operation locality property. We can calculate the probability 
for a general circuit using these Z matrices for each of the operations. Hence 
our list of postulates for Quantum Theory is complete. 
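
The construction of a Z matrix from a completely positive map can be made concrete for a single qubit. What follows is a sketch under stated assumptions rather than a prescribed implementation: the fiducial set (projectors onto |0>, |1>, |+>, |+i>), the function names `z_matrix` and `state_vector`, and the Hadamard example are all hypothetical choices. The matrix is built as the $K_c \times K_a$ array of traces multiplied by the inverse of the Gram matrix of the input fiducial operators, and the check at the end confirms that acting with Z on the probability vector agrees with acting with the map on the density matrix.

```python
import numpy as np

# Assumed fiducial positive operators for a qubit; they are linearly
# independent and span the Hermitian 2 x 2 matrices.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_plus_i = (ket0 + 1j * ket1) / np.sqrt(2)
FIDUCIALS = [np.outer(k, k.conj()) for k in (ket0, ket1, ket_plus, ket_plus_i)]

def z_matrix(channel, fid_in=FIDUCIALS, fid_out=FIDUCIALS):
    """Z = Trace(P_out channel(P_in)) times the inverse Gram matrix of P_in."""
    numerator = np.array([[np.trace(pc @ channel(pa)) for pa in fid_in]
                          for pc in fid_out])
    gram = np.array([[np.trace(p1 @ p2) for p2 in fid_in] for p1 in fid_in])
    return (numerator @ np.linalg.inv(gram)).real

def state_vector(rho, fid=FIDUCIALS):
    """Probability vector p = Trace(P rho) for the fiducial operators."""
    return np.array([np.trace(p @ rho) for p in fid]).real

HADAMARD = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)

def hadamard_channel(rho):
    """Unitary (hence completely positive, trace preserving) map."""
    return HADAMARD @ rho @ HADAMARD.conj().T

def identity_channel(rho):
    return rho

rho = np.array([[0.7, 0.2 - 0.1j],
                [0.2 + 0.1j, 0.3]])       # a valid density matrix
z_h = z_matrix(hadamard_channel)

# Updating the probability vector with Z matches updating rho with the map.
print(np.allclose(z_h @ state_vector(rho), state_vector(hadamard_channel(rho))))  # True
# The identity map gives the identity matrix.
print(np.allclose(z_matrix(identity_channel), np.eye(4)))                         # True
```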

7.4 Reasonable postulates for quantum theory 

The objective of this chapter has been to set up a general probabilistic frame- 
work. It is worth mentioning that we can give the following very reasonable 
postulates which enable us to reconstruct quantum theory within this frame- 
work. 

Information Systems having, or constrained to have, a given information car- 
rying capacity have the same properties. 

Composites Information carrying capacity is additive and local tomography 
is possible (i.e. Nab = NaNb and Kab = KaKb). 

Continuity There exists a continuous reversible transformation between any 
pair of pure states. 

Simplicity Systems are described by the smallest number of probabilities con- 
sistent with the other postulates. 

We can show from the first two postulates that $K = N^r$ where r = 1, 2, .... 
The continuity postulate rules out the classical probability case where $K = N$. 
The simplicity postulate then implies that we have $K = N^2$. We construct the 
Bloch sphere for the $N = 2$ case using, in particular, the continuity postulate. 
Then the information postulate and composites postulate are used to obtain 
quantum theory for general N. We refer the reader to [6, 8] for details. 

8 Conclusions 



We have exhibited a very natural framework for general probabilistic theories 
in an operational setting. We represent experiments by circuits and have been 
particularly careful to give operational interpretations to the elements of these 
circuits (operations are single uses of an apparatus and wires represent apertures 
being placed next to one another). By considering closable sets of operations 
we are able to introduce probabilities and then to set up the full theory wherein 
Z matrices are associated with circuit fragments. The special case of locally 
tomographic theories has the operation locality property so that we can com- 
bine Z matrices corresponding to circuit fragments that are in parallel using the 
tensor product. This enables us to break down a calculation into smaller parts. 
The framework here is still lacking. Most crucially, it is only able to take into 
account one particular way in which operations can be connected (correspond- 
ing to placing apertures next to each other) but there are many other ways. A 
more general theory is under development to allow for other types of 
connections (see [5]). The framework is discrete and hence is not readily adaptable 
to quantum field theory. It would be very interesting either to develop a con- 
tinuous version or to show how quantum field theory can be fully understood in 
such a discrete framework. Algebraic quantum field theory can be understood 
in operational terms (see for example Haag [40]). However, putting the issue 
of discreteness aside, it is a rather less general operational theory than that 
presented in this chapter and so there may be advantages to studying quantum 
field theory in the framework presented here. It is worth saying that there is a 
tension between operationalism and use of the continuum in physics. From an 
operational point of view, the continuum is best understood as a mathematical 
tool enabling us to talk about a series of ever more precise experiments. It 
is possible that such a series may, eventually, be better described with other 
mathematical tools. 

There are two types of motivation for considering general probabilistic the- 
ories. First, we may be able to better formulate and understand our present 
theories within these frameworks. It may be possible to write down a set of pos- 
tulates or axioms which can be used to reconstruct these theories within such a 
framework. For the case of quantum theory there has been considerable work of 
this nature already. It would be interesting to see something similar for general 
relativity. In particular, there ought to be a simple and elegant formulation of 
general relativity for the case where there is probabilistic ignorance of the value 
of quantities that might be measured in general relativity (let us call this 
probabilistic general relativity). Such a theory might be best understood in a general 
probabilistic framework (though probably more general than the one presented 
in this chapter) [5]. The second reason to consider general probabilistic theories 
is to try to go beyond our present theories. The most obvious application 
would be to work towards a theory of quantum gravity (see [4, 7]). The program 
of constructing general probabilistic theories and then constraining them 
using some principles or postulates may free us from the hidden mathematical 
obstacles to formulating quantum gravity that stand in the way of the more 
standard approaches such as string theory and loop quantum gravity. 

Acknowledgements 

Research at Perimeter Institute for Theoretical Physics is supported in part by 
the Government of Canada through NSERC and by the Province of Ontario 
through MRI. I am grateful to Vanessa Hardy for help with the manuscript. 



References 

[1] A. Einstein. Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17 
(1905) 891-921. 

[2] C.A. Fuchs. Quantum mechanics as quantum information 
(and only a little more). arXiv:quant-ph/0205039 (2002). 
http://arxiv.org/abs/quant-ph/0205039 

[3] Conference: Reconstructing quantum theory, organised by Philip Goyal 
and Lucien Hardy. PIRSA:C09016 (2009). http://pirsa.org/c09016 

[4] L. Hardy. Probability theories with dynamic causal structure: A 
new framework for quantum gravity. arXiv:gr-qc/0509120 (2005). 
http://arxiv.org/abs/gr-qc/0509120 

[5] L. Hardy. Operational structures as a foundation for probabilistic theories. 
PIRSA:09060015 (2009). http://pirsa.org/09060015 

[6] L. Hardy. Quantum theory from five reasonable axioms. 
arXiv:quant-ph/0101012 (2001). http://arxiv.org/abs/quant-ph/0101012 

[7] L. Hardy. Towards quantum gravity: A framework for probabilistic theories 
with non-fixed causal structure. Journal of Physics, A40 (2007) 3081-3099. 

[8] L. Hardy. Operational structures and natural postulates for quantum 
theory. PIRSA:09080011 (2009). http://pirsa.org/09080011 

[9] R. Sorkin. Spacetime and causal sets. In J. D'Olivo et al., editor, Relativity 
and Gravitation: Classical and Quantum. World Scientific, (1991). 

[10] R.B. Griffiths. Consistent histories and the interpretation of quantum 
mechanics. Journal of Statistical Physics, 36 (1984) 219-272. 

[11] M. Gell-Mann and J.B. Hartle. Classical equations for quantum systems. 
Physical Review D, 47 (1993) 3345-3382. 

[12] R. Omnès. The Interpretation of quantum mechanics. Princeton Univ. 
Press, 1994. 

[13] J.B. Hartle. Spacetime quantum mechanics and the quantum mechanics 
of spacetime. Gravitation and Quantizations: Proceedings of the 1992 Les 
Houches Summer School, ed. by B. Julia and J. Zinn-Justin, North Holland, 
Amsterdam, (1995). 

[14] F. Markopoulou. Quantum causal histories. Classical and Quantum 
Gravity, 17 (2000) 2059-2077. 

[15] R.F. Blute, I.T. Ivanov, P. Panangaden. Discrete quantum causal 
dynamics. International Journal of Theoretical Physics 42 (2003) 2025-2041. 

[16] P. Panangaden. Discrete quantum causal dynamics. PIRSA:09060029 
(2009). http://pirsa.org/09060029/ 

[17] M. Leifer. Quantum causal networks. PIRSA:06060063 (2006). 
http://pirsa.org/06060063 

[18] S. Abramsky and B. Coecke. A categorical semantics of quantum protocols. 
Proceedings of the 19th Annual IEEE Symposium on Logic in Computer 
Science (LICS 04), (2004) 415-425. 

[19] B. Coecke. Where quantum meets logic, ... in a world of pictures! 
PIRSA:09040001 http://pirsa.org/09040001 

[20] G. Mackey. Mathematical Foundations of Quantum Mechanics. Benjamin 
(1963). 

[21] G. Ludwig. An Axiomatic Basis of Quantum Mechanics, volumes 1 and 2. 
Springer-Verlag (1985, 1987). 

[22] E. B. Davies and J. T. Lewis. An operational approach to quantum 
probability. Communications in Mathematical Physics 17 (1970) 239-260. 

[23] H. Araki. On a characterization of the state space of quantum mechanics. 
Communications in Mathematical Physics 75 (1980) 1-24. 

[24] S. Gudder, S. Pulmannova, S. Bugajski, and E. Beltrametti. Convex and 
linear effect algebras. Reports on Mathematical Physics 44 (1999) 359-379. 

[25] D. J. Foulis and C. H. Randall. Empirical logic and tensor products, in 
H. Neumann (ed.), Interpretations and Foundations of Quantum Theory, 
Bibliographisches Institut, Wissenschaftsverlag, Mannheim (1981). 

[26] J. Barrett. Information processing in generalized probabilistic theories. 
Physical Review A 75 (2007) 032304. 

[27] H. Barnum, J. Barrett, M. Leifer, and A. Wilce. Cloning and broadcasting 
in generic probabilistic models. arXiv:quant-ph/0611295 (2006). 
http://arxiv.org/abs/quant-ph/0611295 

[28] H. Barnum, J. Barrett, M. Leifer and A. Wilce. A general no-cloning 
theorem. Phys. Rev. Lett. 99 240501 (2007). 

[29] H. Barnum, J. Barrett, M. Leifer, and A. Wilce. Teleportation 
in general probabilistic theories. arXiv:0805.3553 (2008). 
http://arxiv.org/abs/0805.3553 

[30] H. Barnum and A. Wilce. Information processing in convex operational 
theories. Proceedings of QPLV-DCMIV, Electronic Notes in Theoretical 
Computer Science (2008). 

[31] H. Barnum and A. Wilce. Ordered linear spaces and categories as 
frameworks for information-processing characterizations of quantum and classical 
theory. arXiv:0908.2354 (2009). http://arxiv.org/abs/0908.2354 

[32] W.K. Wootters. Local accessibility of quantum states, in Complexity, 
entropy and the physics of information, edited by W. H. Zurek 
(Addison-Wesley, 1990). 

[33] W.K. Wootters. Quantum mechanics without probability amplitudes. 
Foundations of Physics 16 (1986) 391. 

[34] S. Popescu and D. Rohrlich. Nonlocality as an axiom. 
Foundations of Physics 24 (1994) 379. 

[35] M. Pawlowski, T. Paterek, D. Kaszlikowski, V. Scarani, A. Winter and M. 
Zukowski, et al. A new physical principle: Information causality. Nature 
461 (2009) 1101. 

[36] D. Gross, M. Mueller, R. Colbeck, O.C.O. Dahlsten. All reversible 
dynamics in maximally non-local theories are trivial. arXiv:0910.1840 (2009). 
http://arxiv.org/abs/0910.1840 

[37] G.M. D'Ariano. How to Derive the Hilbert-Space formulation of quantum 
mechanics from purely operational axioms. arXiv:quant-ph/0603011 
(2006). http://arxiv.org/abs/quant-ph/0603011 

[38] G. Chiribella, G.M. D'Ariano, P. Perinotti. Probabilistic theories with 
purification. arXiv:0908.1583 (2009). http://arxiv.org/abs/0908.1583 

[39] D. Gillies. Philosophical theories of probability. Routledge (2000). 

[40] R. Haag. Local quantum physics: Fields, particles, algebras. Springer 
(1992). 



